1 Introduction


[Figure 1: comic strip. Captain Tagon: “Kevyn, I'm taking Schlock and Elf on a three-day op. You're in charge. Don't let the troops get into any trouble you can't get back out of with reasonable bribes or zero friendly casualties.” Kevyn: “Can you give me a range of values for ‘reasonable’ and ‘friendly?’”]

Figure 1: “Schlock Mercenary,” © 2000-2006, The Tayler Corporation, All Rights Reserved. Used with permission.


As  shown  above  in  Figure  1,  mercenary  captain  Kaff  Tagon  has  a  problem.  He 
needs  to  select  an  officer  to  command  his  troops  in  his  absence.  He  chooses  his  human 
commander,  Kevyn,  and  leaves  verbal  orders:  “Don’t  let  the  troops  get  into  any  trouble 
you  can’t  get  back  out  of  with  reasonable  bribes  or  zero  friendly  casualties.”  Kevyn, 
more  an  engineer  than  a  commander,  asks  for  more  specifics  -  a  “range  of  values  for 
‘reasonable’  and  ‘friendly.’”  Unsure  of  what  judgments  his  captain  would  have  him 
make,  he  asks  Tagon  to  quantify  his  order.  In  the  last  panel,  Tagon  reveals  that  he  is 
unwilling  or  unable  to  make  his  orders  any  more  specific,  and  that  if  he  could  assign 
numbers  to  the  judgment  call,  he  would  have  left  the  ship’s  artificial  intelligence  agent  in 
charge. 

This  comic  strip  neatly  summarizes  a  dichotomy  that  points  to  a  real  problem  in 
automated  systems  research.  Both  potential  command  candidates  -  the  human  and  the 
A.I.  agent  -  would  like  the  captain  to  express  his  orders  in  a  crisper,  more  quantifiable 
fashion.  This  is  not  an  unreasonable  request  for  either  to  make,  especially  since  the 
captain  will  be  very  displeased  indeed  if  company  money  is  wasted  on  “unreasonable” 
bribes or “friendly” casualties are not kept to zero. But the captain cannot give numbers
to either - and only trusts the human decision-maker to be able to properly function
without them. The A.I. agent is assumed to lack the ability to translate this vague order
into  a  policy  for  overseeing  the  mercenary  troops  in  the  captain’s  absence.  But  we  might 
like  to  have  a  system  that  can  do  exactly  that.  We  would  like  for  it  to  understand  our 
priorities  and  expectations  without  having  to  specify  them  as  numbers  that  may  be 
counter-intuitive  or  even  inaccurate. 

We  can,  for  example,  easily  identify  “aggressive  driving”  when  we  see  it  on  the 
roads.  An  aggressive  driver’s  behavior  is  marked  by  high  traveling  speeds,  frequent  lane 
changes,  sudden  accelerations,  and  the  maintenance  of  the  very  barest  of  safety  margins 
between  other  vehicles.  What  is  a  “high  traveling  speed?”  Even  once  the  context  is  fixed 
(e.g.,  interstate  vs.  in-town),  the  linguistic  term  has  some  fuzziness  to  it.  Certainly,  it 
implies  a  speed  higher  than  the  legal,  posted  speed  limit.  It  probably  means  a  speed 
higher  than  the  average  speed  of  the  other  drivers.  But  is  someone  driving  5  m.p.h.  faster 
than  road  speed  an  aggressive  driver?  And  on  the  other  extreme,  is  there  a  speed  so  high 
that  we  can  say  that  we  have  gone  past  “aggressive  driving”  and  are  into  a  region  of 
“reckless  driving?”  When,  exactly,  is  that  line  crossed?  The  answers  to  these  questions 
are  easy  for  humans  to  intuit,  but  difficult  to  formalize. 

These  concerns  follow  us  into  the  realm  of  trajectory  optimization.  Robot 
trajectories  do  not  have  to  be  optimal.  In  some  domains,  we  may  accept  satisficing 
trajectories  that  simply  get  the  job  done  without  capsizing  the  robot  or  running  it  into 
walls.  But  in  the  space  domain  in  particular,  we  will  always  be  concerned  with 
conserving  precious  fuel  and  power.  Even  if  we  want  an  “aggressive,  fast”  trajectory,  we 
will still want it to be the most fuel-efficient aggressive trajectory. We are concerned
with  fuel  even  if  the  result  is  not  the  fuel-optimal  solution. 

Trajectory  optimization,  at  its  most  general,  will  have  multiple  objectives  and 
constraints.  Multiple  objectives  in  particular  give  rise  to  multiple  possible  solutions. 
Consider  a  two  objective  case:  we  desire  to  save  both  time  and  fuel.  These  objectives 
compete with one another. The most fuel-efficient trajectory is rarely the most
time-efficient trajectory, and vice versa. There might be several solutions that take the same
minimum  time,  and  we  would  be  interested  in  the  most  fuel-efficient  one.  Or,  we  might 
examine  all  of  the  minimum  fuel  solutions  and  pick  the  fastest  of  those.  Or,  we  might 
want  some  solution  in  between  -  one  that  is  neither  the  fastest  nor  the  most  fuel-efficient, 
but  balances  the  two  objectives  in  some  fashion.  Starting  out,  we  may  have  some  idea  of 
the  kind  of  solution  we  want  or  of  the  relative  importance  of  saving  time  or  fuel.  How 
can  we  communicate  those  preferences  to  the  numeric  solver  that  will  compute  the 
optimal  trajectory,  so  that  it  will  find  the  “right”  optimal? 

In  addition  to  these  fuzzy  preference  ideas,  we  may  also  have  constraints  to  place 
on  the  trajectory.  The  generated  trajectory  must  obey  the  system’s  dynamics,  and  it  must 
avoid collisions. Beyond that, we might further impose limits on both the control inputs
(to reflect the realities of travel limits or thruster saturation) and the state (as safety
measures).  Some  of  these  limits  could  be  “soft,”  like  a  posted  speed  limit.  It’s  a  good 
idea  to  obey  the  speed  limit,  but  circumstances  might  require  one  to  exceed  it  (e.g.,  to 
avoid  an  erratic  driver).  Other  limits  are  “hard.”  Astronauts  will  start  blacking  out  if 
their re-entry capsule exceeds acceleration limits.

There are many approaches to solving the constrained multi-objective
optimization  problem  [1],  some  of  which  -  evolutionary  algorithms,  mixed  integer  linear 
programming,  and  optimal  control  theory  -  will  be  reviewed  in  Chapter  2.  To  a  greater 
or  lesser  extent,  they  all  can  handle  soft  and  hard  constraints.  All,  however,  include  an 
iterative  refinement  loop  to  find  those  solutions  that  correspond  to  user  preference:  that 
is,  to  find  the  “optimal  optimal”  solution.  And  in  all  of  the  papers  reviewed,  it  was  tacitly 
understood  that  a  human  user  would  be  interacting  directly  with  the  optimization 
algorithm  in  that  loop,  injecting  preference  information  to  focus  the  optimization  on  the 
areas  of  interest  to  the  user. 

This  work  seeks  to  take  the  human  user  out  of  that  loop  as  much  as  possible. 
Before  optimization  ever  begins,  classes  of  motion  are  typified  with  linguistic 
expressions:  aggressive,  curious,  careful.  Fuzzy  logic  [2]  is  an  appropriate  tool  for 
approaching  the  problem  of  translating  natural  language  utterances  into  numeric  terms 
[3].  The  words  are  correlated  to  fuzzy  state  values  that  the  system  estimates  that  the  user 
expects  to  see  in  the  resulting  trajectory:  the  “numbers  for  it”  that  we  need  to  leave  an 
A.I.  in  charge.  Some  iteration  may  be  required  here  to  ensure  that  the  user’s  expectation 
matches  the  fuzzy  definition  of  the  linguistic  expressions.  However,  once  that  process  is 
complete,  the  user  can  interact  with  the  optimal  trajectory  generator  in  a  much  more 
hands-off  fashion. 

This  work  considers  a  planetary  rover  and  an  Earth-observing  satellite  as 
motivating  examples.  The  planetary  rover  example  is  a  very  simplified  case  with  two 
degrees  of  freedom  and  linear  dynamics  that  provided  initial  insight  into  the  trajectory 
generation  and  modification  problem.  The  satellite  case  has  more  sophisticated  and 
realistic dynamics and all six degrees of freedom. The satellite’s hypothetical job is to
provide imaging to support ground-based decision making. It can execute fuel burns to
change  its  orbit  in  response  to  user  demands  on  the  ground.  These  user  demands  may 
have  varying  levels  of  urgency.  Some  may  be  matters  of  some  curiosity,  but  no  urgency 
at  all,  and  the  satellite  is  free  to  execute  maneuvers  whenever  it  is  most  fuel-efficient  to 
do  so.  It  may  also  need  to  maneuver  around  other  space  objects  (perhaps  other  satellites 
that  require  observation). 

This  work  proposes  an  architecture  that  can  intelligently  meet  these  demands.  A 
cognitively-inspired  expert  system  moderates  the  trajectory  generation  and  optimization 
process.  At  initialization,  a  solution  technique  is  selected  based  on  problem 
characteristics.  If  an  initial  trajectory  estimate  is  required  for  the  solution  technique,  one 
is  generated,  again  with  consideration  for  the  problem  characteristics.  Finally,  expressed 
user  preferences  are  transformed  via  fuzzy  methods  into  an  initial  set  of  weights  or  other 
parameters,  and  the  selected  solution  technique  is  run. 

The  expert  system  also  considers  the  results  of  the  solution.  Often  in  these 
problems,  one  or  more  user-defined  constraints  or  preferences  will  not  be  met  after  the 
first  iteration.  Making  changes  to  the  weight  vector  or  to  other  parameters  may  solve  the 
problem;  so  may  a  different  initial  trajectory  estimate  or  the  use  of  a  different  solution 
technique  (e.g.,  if  the  problem  will  not  solve  using  the  first  technique).  Given  its 
knowledge  base  and  the  current  history  of  repair  attempts  for  this  problem,  the  expert 
system  continues  to  search  for  an  appropriate  trajectory. 
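The solve-check-repair cycle described above can be illustrated with a deliberately small sketch. It is not the dissertation's implementation: the word-to-weight table, the toy one-dimensional "solution technique," and the single repair rule (shift weight toward fuel when the fuel budget is exceeded) are all hypothetical stand-ins chosen only to make the control flow concrete.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    fuel_budget: float   # hard limit on fuel use (hypothetical units)
    preference: str      # linguistic preference, e.g. "aggressive"


# Hypothetical mapping from words to initial (time, fuel) weights.
INITIAL_WEIGHTS = {"aggressive": (0.9, 0.1), "careful": (0.3, 0.7)}


def solve(weights, durations):
    """Toy solution technique: pick the duration T that minimizes
    w_time * T + w_fuel * fuel(T), with the assumed model fuel(T) = 10 / T."""
    w_time, w_fuel = weights
    best = min(durations, key=lambda T: w_time * T + w_fuel * (10.0 / T))
    return best, 10.0 / best   # (duration, fuel used)


def plan(problem, max_repairs=20):
    """Run the solver, check the hard limit, and repair the weights if needed."""
    weights = INITIAL_WEIGHTS[problem.preference]
    durations = [0.5 * k for k in range(1, 41)]   # candidate durations 0.5 .. 20.0
    history = []                                  # record of repair attempts
    for _ in range(max_repairs):
        duration, fuel = solve(weights, durations)
        history.append((weights, duration, fuel))
        if fuel <= problem.fuel_budget:
            return duration, fuel, history
        # Hypothetical repair rule: move 0.1 of the weight from time to fuel.
        w_fuel = min(1.0, weights[1] + 0.1)
        weights = (1.0 - w_fuel, w_fuel)
    return None, None, history


if __name__ == "__main__":
    duration, fuel, history = plan(Problem(fuel_budget=4.5, preference="aggressive"))
    print(duration, fuel, len(history))
```

In this toy the "aggressive" request is initially infeasible under the fuel budget, and the loop trades away speed until the hard limit is respected. A fuller system would also consider changing the initial guess or the solution technique.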
1.1 Contributions


This  dissertation  describes  an  architecture  for  organizing  trajectory  generation  and 
optimization  tools  into  an  overall  unit  that  allows  optimization  with  respect  to  poorly 
defined,  fuzzy  user  preferences  as  well  as  respecting  hard  limits  or  boundaries  placed  on 
the  results  (whether  these  are  a  result  of  system  dynamics  or  are  also  matters  of  user 
preference).  This  problem  is  challenging  because  it  seeks  to  impose  an  automated, 
formalized  framework  around  a  process  that  is  usually  accomplished  by  many  man-hours 
of  trial  and  error.  Suggesting  such  a  framework  and  showing  that  it  is  useful  in  both 
linear  and  nonlinear  dynamical  systems  is  one  contribution  of  this  work.  Finding  the 
rules  and  techniques  that  would  allow  such  automation,  and  showing  their  generality,  is 
another  contribution. 

1.2  Organization 

Chapter  2  introduces  work  related  to  this  research.  As  the  proposed  architecture 
draws  from  a  wide  variety  of  sources,  including  traditional  optimization  methods, 
cognitive  modeling,  and  fuzzy  set  theory,  Chapter  2  is  wide-ranging.  Chapter  3  presents 
the  architecture  in  more  detail.  Chapter  4  presents  the  first  implementation  of  the 
architecture and the results of the two-degree-of-freedom rover model. The extension to a
6DOF  space  satellite,  and  the  changes  this  required,  are  given  in  Chapter  5.  Chapter  6 
concludes  with  a  summary  and  review  of  the  contributions  of  this  research  and  explores 
future avenues of research.

2 Related Work


This  work  incorporates  results  from  many  fields.  This  chapter  starts  with  a  look 
at  some  of  the  natural  language  literature  dealing  with  position  and  motion.  Then,  it  will 
consider  the  traditional  robotics  approaches  to  motion  planning  and  identify  the  primary 
shortcomings  there.  The  different  approaches  to  solving  multi-objective  optimization 
problems  are  next  addressed,  and  their  fitness  for  solving  trajectory  optimization 
problems  in  particular  is  examined.  Finally,  fuzzy  set  theory  is  introduced,  and  the 
literature  linking  it  both  to  natural  language  and  to  optimization  is  explored. 

2.1  Natural  Language 

Verbal  or  written  instructions  are  one  possible  mode  of  interaction  between  a 
human  and  a  robot  (e.g.,  as  in  [4]).  Decoding  the  meaning  of  these  utterances  is  the 
province of natural language processing. With respect to motion words: “some
languages  like  English  regularly  encode  manner  of  motion  in  verbs,  such  as  ‘swagger,’ 
‘slink,’  ‘slide,’  and  ‘sway’...  Choice  of  verb  is  open  to  construal.”  [5].  That  is,  the 
speaker  has  a  variety  of  choices  to  describe  how  a  route  from  a  location  A  to  B  is 
traversed.  What  verbs  the  speaker  selects  will  indicate  to  some  extent  the  “manner  of 
motion”  the  speaker  requires  of  the  robot. 

To  what  extent?  There  is  no  appreciable  literature  dealing  with  the  transformation 
of  verbs  to  numbers.  There  is,  however,  a  literature  on  assigning  numeric  values  to 
spatial expressions such as “near” or “in front of” [6, 7, 8]. Researchers use, among other
techniques,  a  potential  field  (first  developed  for  robotic  path  planning  [9])  as  a 
membership  function  in  the  fuzzy  set  theory  sense;  indeed,  fuzzy  set  theory  terms  like 
“crisp” and “scruffy” appear frequently. Essentially, one point or line is selected (by the
researchers) to represent the ideal of “near,” “along,” or “in front of” some object in the
space.  This  becomes  the  minimum  for  the  potential  field,  which  can  be  visualized  as 
something  like  a  bowl.  The  bottom  of  the  bowl  (the  minimum  of  the  potential  field)  is 
located  at  the  location  of  the  ideal  representation  of  “near,”  “along,”  etc.  As  one  moves 
farther  away  from  that  point,  line,  or  region,  the  bowl  slopes  up.  The  potential  field 
returns  higher  values  the  farther  away  from  the  minimum  one  moves.  In  this  way,  the 
field  generates  a  value  for  how  much  “nearness”  or  “in  frontness”  any  coordinate  in  the 
space  possesses.  The  smaller  the  generated  value,  the  fairer  it  is  to  apply  the  spatial 
expression  to  it.  At  sufficiently  large  values,  the  spatial  expression  can  be  judged  entirely 
false  (e.g.,  something  behind  a  desk  is  in  no  way  in  front  of  it). 
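As a minimal sketch of this idea, the following assumes a quadratic bowl for the potential field and a linear mapping from potential to a degree of membership. The function names, the `scale` parameter, and the `cutoff` value are illustrative assumptions, not taken from the cited work.

```python
import math


def near_potential(point, ideal, scale=1.0):
    """Bowl-shaped potential: zero at the ideal location of "near",
    growing with the squared distance from it."""
    return (math.dist(point, ideal) / scale) ** 2


def near_membership(point, ideal, scale=1.0, cutoff=4.0):
    """Convert the potential into a degree of "nearness" in [0, 1].
    Membership is 1 at the ideal point and 0 once the potential
    reaches the (assumed) cutoff, where "near" is judged entirely false."""
    u = near_potential(point, ideal, scale)
    return max(0.0, 1.0 - u / cutoff)


if __name__ == "__main__":
    ideal = (0.0, 0.0)
    for p in [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]:
        print(p, near_membership(p, ideal))
```

Small potential values correspond to high membership, matching the convention above that the smaller the generated value, the fairer it is to apply the expression.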

This  work  extends  this  idea  to  motion  words.  First,  we  define  a  “state  feature 
space”  composed  of  state  features  such  as  average  forward  velocity  and  maximum 
acceleration.  A  collection  of  points  in  state  feature  space  is  taken  (by  the  researchers)  as 
the  ideal  representation  of  a  verb  or  adverb/verb  pair,  like  “jog”  or  “move  stealthily.” 
Fuzzy  membership  functions  are  then  defined  around  these  areas,  so  that  similar  but  not 
identical  kinds  of  motion  can  still  be  included  in  these  classifications.  This  gives  some 
flexibility  when  trying  to  satisfy  the  multiple  constraints  and  objectives  such  terms  imply 
while  maintaining  the  user’s  preference  for  motion  type. 
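One way to picture this construction is sketched below. The two state features, the prototype points, the per-feature widths, and the Gaussian shape of the membership function are all assumptions made for illustration; the definitions actually used are developed in later chapters.

```python
import math

# Hypothetical prototype points in state feature space:
# (average forward speed [m/s], maximum acceleration [m/s^2]).
PROTOTYPES = {
    "jog": [(2.5, 1.0), (3.0, 1.5)],
    "move stealthily": [(0.5, 0.2)],
}

# Assumed spread of the membership function along each feature axis.
WIDTHS = (1.0, 0.5)


def membership(features, word):
    """Degree to which a trajectory with the given state features fits the
    motion word: a Gaussian bump around each prototype, taking the best match."""
    best = 0.0
    for proto in PROTOTYPES[word]:
        d2 = sum(((f - p) / w) ** 2 for f, p, w in zip(features, proto, WIDTHS))
        best = max(best, math.exp(-0.5 * d2))
    return best


if __name__ == "__main__":
    observed = (2.8, 1.2)   # features of some candidate trajectory
    for word in PROTOTYPES:
        print(word, round(membership(observed, word), 3))
```

A trajectory that sits between the two "jog" prototypes still receives a high membership for "jog," which is the flexibility the fuzzy definition is meant to provide.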

Motion  type  is  unavoidably  somewhat  domain-dependent  when  it  comes  to 
translating  words  to  actual  numeric  values.  We  may  define  “quickly,”  for  instance,  as 
having  “high  average  speed”  but  exactly  how  fast  that  is  will  depend  not  only  on  the 
dynamic  agent  but  also  possibly  on  its  environment.  “High  speed”  for  a  human  runner  is 
different from “high speed” for a car; similarly, a car going “too fast” through a
residential  area  is  in  absolute  terms  of  velocity  probably  traveling  at  a  speed  that  would 
be  “too  slow”  on  the  highway.  Our  definitional  databases  for  each  domain  will  store 
these  domain-based  differences. 

2.2  Path  Planning  and  Traversal 

The  task  of  moving  a  robot  from  a  start  location  to  a  goal  location  has  been  much- 
studied.  Path  planning  focuses  on  finding  a  path  through  free  space  for  the  robot  to 
follow  [10].  The  robot’s  dynamics  are  not  generally  considered  at  all  for  holonomic 
robots.  For  nonholonomic  robots,  dynamic  constraints  that  directly  affect  path,  such  as  a 
turning  radius,  may  be  used  to  reject  some  paths.  When  following  the  path,  the  robot  is 
typically  pre-programmed  with  a  simple  trapezoidal  velocity  profile.  It  accelerates  at  a 
pre-programmed  rate  to  achieve  its  traveling  speed,  and  then  it  maintains  that  speed  until 
an  obstacle  or  the  goal  requires  it  to  smoothly  decelerate.  The  rates  of  acceleration  and 
the  traveling  speed  are  set  well  within  the  robot’s  operating  parameters,  so  no  impossible 
demands  will  be  made  of  the  motors,  and  so  that  emergency  stops  will  have  a  very 
minimal  stopping  distance.  For  slow,  wheeled  robots,  especially  those  in  a  laboratory  or 
office  environment,  this  model  usually  suffices  to  move  the  robot  around. 
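The trapezoidal profile described above can be written down directly. The sketch below assumes a rest-to-rest move along a straight segment with equal acceleration and deceleration rates; the function and parameter names are illustrative.

```python
def trapezoidal_speed(t, distance, v_max, accel):
    """Commanded speed at time t for a rest-to-rest move of length `distance`.

    The robot ramps up at `accel`, cruises at `v_max`, and ramps down at the
    same rate. If the move is too short to reach `v_max`, the profile
    degenerates into a triangle.
    """
    t_ramp = v_max / accel
    d_ramp = 0.5 * accel * t_ramp ** 2
    if 2 * d_ramp > distance:
        # Short move: accelerate for half the distance, then decelerate.
        t_ramp = (distance / accel) ** 0.5
        v_peak = accel * t_ramp
        t_cruise = 0.0
    else:
        v_peak = v_max
        t_cruise = (distance - 2 * d_ramp) / v_max
    t_total = 2 * t_ramp + t_cruise

    if t <= 0 or t >= t_total:
        return 0.0
    if t < t_ramp:
        return accel * t                 # acceleration phase
    if t <= t_ramp + t_cruise:
        return v_peak                    # constant traveling speed
    return accel * (t_total - t)         # deceleration phase


if __name__ == "__main__":
    for t in [0, 1, 3, 6, 7]:
        print(t, trapezoidal_speed(t, distance=10.0, v_max=2.0, accel=1.0))
```

Nothing in this profile is optimized: the rates are fixed ahead of time, which is exactly the limitation that motivates trajectory optimization.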

A  step  beyond  this  simple  model  lies  behavior-based  motion  control  [11].  Here, 
environmental  cues  trigger  one  of  a  suite  of  pre-programmed  responses.  The  resulting 
actions  can  seem  quite  sophisticated  or  even  emotional  [12].  A  robot  that  is  “frightened” 
of  red  lights  can  be  programmed  to  respond  to  that  stimulus  by  moving  away  from  red 
lights  very  quickly.  The  same  robot  might  also  be  programmed  to  approach  green  lights 
very  slowly,  giving  it  an  air  of  cautious  interest.  However,  the  behaviors  “flee  from  red 
light” and “investigate green light” and the parameters that define them are typically
fixed,  selected  either  by  human  researchers  or  machine  learning  techniques  [13]. 

Although  some  research  has  been  done  into  making  the  parameters  that  define  these 
behaviors  adaptive  to  environmental  stimuli,  they  are  still  reactive  in  nature,  making  any 
sort  of  overall  optimization  impossible  [14,  15]. 

Whereas  most  behavior-based  robotics  can  be  modeled  with  some  kind  of  Markov 
decision  process  (MDP),  another  branch  of  research  looks  at  using  a  hybrid  dynamical 
system  approach  [16,  17].  In  a  hybrid  dynamical  system,  discrete  events  trigger  shifts 
between  different  continuous  dynamics,  e.g.,  reaching  a  fill  line  (a  discrete  event)  causes 
a  shift  in  water  flow  rate  into  a  tank  (continuous  system  dynamics).  In  the  cited  work,  a 
simulated  mouse  agent  adjusts  its  trajectory  in  response  to  the  environment.  As  in  this 
work,  this  is  accomplished  via  changing  weights.  The  weights,  however,  are  weights  on 
the  repelling  and  attracting  potential  functions  used  for  local  navigation  -  there  is  no 
possibility  of  global  optimization.  Further,  all  the  weight  changes  are  based  on  a  single 
“comfort”  parameter  and  are  linked  directly  to  speed.  The  longer  the  mouse  has  gone 
without  encountering  a  nearby  obstacle,  the  more  “comfortable”  it  becomes  and  the  faster 
it  moves.  Its  level  of  aggression  (e.g.,  how  closely  it  will  approach  obstacles)  is  also 
linked  directly  to  this  comfort  parameter.  While  a  hybrid  dynamical  systems  approach 
provides  an  excellent  way  to  realistically  animate  computer  agents  reacting  to  objects  in 
their  world  (which  is  its  goal),  it  does  not  yet  offer  the  ability  to  meet  constraints  or 
optimize  cost  objectives;  this  is  an  open  research  area. 
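The tank example of a hybrid dynamical system mentioned above can be simulated in a few lines. This is only an illustration of the discrete-event/continuous-dynamics split; the rates, the fill line, and the fixed-step Euler integration are assumed values, and the code is unrelated to the cited mouse-agent work.

```python
def simulate_tank(fill_line=5.0, fast_rate=1.0, slow_rate=0.1, dt=0.1, steps=100):
    """Simulate a tank whose inflow rate switches when the fill line is reached.

    Continuous dynamics: d(level)/dt = rate, integrated with Euler steps.
    Discrete event: the level crossing `fill_line` switches the mode
    from "fast" to "slow", which changes the continuous dynamics.
    """
    level, mode = 0.0, "fast"
    trace = []
    for _ in range(steps):
        if mode == "fast" and level >= fill_line:
            mode = "slow"                       # discrete transition
        rate = fast_rate if mode == "fast" else slow_rate
        level += rate * dt                      # continuous evolution
        trace.append((level, mode))
    return trace


if __name__ == "__main__":
    trace = simulate_tank()
    print(trace[0], trace[-1])
```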

“Programming  by  reward”  is  a  technique  that  elicits  different  dynamic  behaviors 
from  a  system  [18].  Like  our  research,  it  uses  preference  information  to  create  these 
differences. Unlike our research, it injects the preference information into machine
learning  algorithms  for  the  development  of  motion  behaviors.  These  behaviors  can  even 
form  an  optimal  policy,  given  preferences.  However,  the  “interiors”  of  the  behaviors  are 
still  black  boxes.  The  number  of  lane  changes  in  the  authors’  driving  example  can  be 
optimized  for  a  safe  driver  and  for  a  reckless  driver,  but  the  dynamics  of  that  lane  change 
are  unexamined.  Our  research  is  especially  interested  in  controlling  the  low-level  inputs 
that  result  in  the  desired  behaviors,  rather  than  assembling  pre-typed  behaviors  into  a 
policy. 

It  is  no  coincidence  that  the  fastest  real-time  motion  planners  -  artificial  potential 
function  guidance  -  are  the  ones  with  the  least  capability  for  optimization  [9],  and  that 
the  most  flexible  optimization  routines  are  the  most  computationally  intensive.  The  high 
dimensionality,  nonlinearity,  constraints  and  multiple  objectives  all  combine  to  make 
trajectory  optimization  a  very  difficult  problem.  In  environments  that  are  highly 
“dynamic”  in  the  AI  community’s  sense  of  the  word  (that  is,  rapidly  and  unpredictably 
changing),  optimization  will  rarely  be  feasible.  Fast,  reactive  motion  planning  will 
undoubtedly  continue  to  play  a  role  in  robot  control.  But  in  the  space  domain, 
optimization  is  a  key  tool  in  developing  fuel-  and  time-efficient  trajectories  while 
ensuring  safety.  The  environment  is  in  most  cases  very  well-known,  and  the  current  and 
future  locations  of  obstacles  can  be  predicted  with  a  high  degree  of  accuracy  and 
certainty.  This  is  very  unlike  the  rapid  and  unpredictable  changes  encountered  by  robots 
operating in home, office, or street environments on Earth.

2.3 Kinodynamic Planning

Trajectory,  or  kinodynamic,  planning  is  a  newer  field.  Methods  based  on  the  idea 
of  velocity  obstacles  [19,  20]  are  conceptually  similar  to  path  planning  “roadmap” 
methods,  with  forbidden  velocities  (e.g.,  those  that  would  cause  collisions  or  exceed 
system  capabilities)  modeled  as  obstacles  in  a  velocity  space.  Spline  methods  [21,  22] 
decouple  the  path  and  trajectory  planning  process;  once  a  clear  path  through  space  is 
found,  interpolating  splines  are  used  to  find  smooth  trajectories  along  them.  Randomized 
kinodynamic  planning  [23]  explores  the  state  space  in  a  random  fashion,  working 
forward  from  the  start  state  and  backward  from  the  goal  state  until  the  search  trees  meet. 
All  of  these  methods  search  for  dynamically  feasible  trajectories;  none  of  them  by 
themselves  have  any  notion  of  optimality.  They  can  be  used  as  input  to  optimization 
routines,  however.  Velocity  obstacles  and  spline  methods  have  both  been  used  to 
generate  time  optimal  trajectories  [24,  25]  and  randomized  methods  are  often  used  as 
starting  points  for  linear  programming  methods  to  generate  fuel  optimal  trajectories  [26]. 
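To make the notion of a forbidden velocity concrete, the sketch below tests whether a candidate robot velocity lies inside the velocity obstacle of a single disc-shaped obstacle moving at constant velocity. It illustrates the general geometric idea only, under the assumptions of planar motion, constant velocities, and an unbounded time horizon; it is not the algorithm of the cited papers, and the names are illustrative.

```python
def in_velocity_obstacle(p_robot, v_robot, p_obs, v_obs, radius):
    """Return True if holding `v_robot` eventually brings the robot within
    `radius` (the sum of robot and obstacle radii) of the obstacle center."""
    # Position of the obstacle relative to the robot.
    rx, ry = p_obs[0] - p_robot[0], p_obs[1] - p_robot[1]
    # Velocity of the robot relative to the obstacle.
    vx, vy = v_robot[0] - v_obs[0], v_robot[1] - v_obs[1]

    if rx * rx + ry * ry <= radius ** 2:
        return True                      # already in contact
    speed2 = vx * vx + vy * vy
    if speed2 == 0.0:
        return False                     # no relative motion
    t_closest = (rx * vx + ry * vy) / speed2
    if t_closest <= 0.0:
        return False                     # moving apart
    # Separation vector at the time of closest approach.
    dx, dy = rx - vx * t_closest, ry - vy * t_closest
    return dx * dx + dy * dy <= radius ** 2


if __name__ == "__main__":
    print(in_velocity_obstacle((0, 0), (1, 0), (5, 0), (0, 0), 1.0))
    print(in_velocity_obstacle((0, 0), (1, 1), (5, 0), (0, 0), 1.0))
```

A planner in this family would reject every candidate velocity for which the test returns True and search among the rest, which yields feasibility but, as noted above, no notion of optimality.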

2.4  Multi-objective  Optimization 

At  its  most  general,  trajectory  optimization  is  a  multi-objective  problem  with 
constraints.  Both  the  multiple  objectives  and  the  constraints  complicate  the  solution  of 
the problem, which is already, except in certain very specialized cases like
station-keeping, nontrivial. A variety of approaches have been developed to solve
multi-objective optimization problems, and they will be surveyed below. While the different
approaches  have  different  strengths  and  weaknesses,  they  all  have  one  thing  in  common: 
a  parameter  or  set  of  parameters  that  can  be  adjusted  to  reflect  the  user’s  priorities 
concerning objectives. This common problem motivates the work of this dissertation.

The general form of the constrained multi-objective optimization problem is:

\[ \text{minimize } \left( f_1(x),\ f_2(x),\ \ldots,\ f_k(x) \right) \tag{2.1} \]

subject to the m inequality constraints

\[ g_i(x) \le 0, \quad i = 1, 2, \ldots, m \tag{2.2} \]

and the p equality constraints

\[ h_j(x) = 0, \quad j = 1, 2, \ldots, p \tag{2.3} \]

where $f_i$ are the k objective functions $f_i : \mathbb{R}^n \to \mathbb{R}$ and $x = [x_1, x_2, \ldots, x_n]^T$ is the vector of
decision variables. P is the set of vectors that satisfy Equations 2.2 and 2.3; the set of
optimal vectors $x^*$ will be found in P.
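A toy instance may help fix the notation. The sketch below assumes a vehicle that burns at constant acceleration for a time x_1 and then coasts for a time x_2, and it must cover a fixed distance; the two objectives are total time and fuel (taken as proportional to burn time). The model and all numbers are hypothetical and ignore braking.

```python
def objectives(x):
    """(f_1, f_2): total elapsed time and fuel used (proportional to burn time)."""
    burn, coast = x
    return (burn + coast, burn)


def inequality_constraints(x):
    """g_i(x) <= 0: both durations must be nonnegative."""
    burn, coast = x
    return (-burn, -coast)


def equality_constraints(x, accel=1.0, distance=4.0):
    """h_j(x) = 0: the distance covered must equal the required distance."""
    burn, coast = x
    travelled = 0.5 * accel * burn ** 2 + accel * burn * coast
    return (travelled - distance,)


def is_feasible(x, tol=1e-9):
    """Membership test for the feasible set P."""
    return (all(g <= tol for g in inequality_constraints(x))
            and all(abs(h) <= tol for h in equality_constraints(x)))


if __name__ == "__main__":
    for x in [(2.0, 1.0), (1.0, 3.5), (1.0, 1.0)]:
        print(x, is_feasible(x), objectives(x))
```

The two feasible points shown trade time against fuel: a longer burn arrives sooner but uses more fuel, so neither is better in both objectives.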


2.5  Evolutionary  Approaches 

Genetic  and  evolutionary  algorithms  (GAs  and  EAs)  have  become  popular  search 
and  optimization  tools  since  their  introduction  in  the  1970s  [27].  A  population  of 
potential  problem  solutions  is  generated  and  encoded  (in  binary  form  for  GAs,  and  in 
other  representations  for  EAs),  then  tested  against  some  fitness  function.  A  new 
population is then formed from the first by selecting the “fittest” individuals for survival
and  “breeding”  them  by  combining  features  of  their  solutions  into  new  solutions.  The 
fittest  individuals  themselves  may  also  be  included  in  the  new  population.  Mutations  - 
small  changes  made  at  random  to  the  solution  -  also  increase  novelty  in  the  new 
population.  There  are  variations  on  this  pattern  of  EA  iteration,  but  this  is  the  basic 
technique.

Since there is rarely a single point where all of the multiple objectives are
simultaneously maximized or minimized, evolutionary multi-objective optimization
(EMO) frequently makes use of Pareto optimality [28, 29]. Given a set of k objective
functions $f_i$, a vector of decision variables $x^* \in P$ is Pareto optimal if there does not exist
another $x \in P$ such that $f_i(x) \le f_i(x^*)$ for all $i = 1, \ldots, k$ and $f_j(x) < f_j(x^*)$ for at least
one j. In other words, there is no feasible vector x which would decrease some criterion
without also increasing some other criterion. The set of vectors $x^*$ that are Pareto
optimal are called nondominated. The corresponding set of Pareto optimal solutions to
the objective functions $f_i$ is called the Pareto front.
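The dominance test in this definition translates directly into code. The following is a minimal sketch for a finite set of candidate objective vectors, assuming every objective is to be minimized; the quadratic-time filter is chosen for clarity rather than efficiency.

```python
def dominates(a, b):
    """True if objective vector `a` dominates `b`: no worse in every
    objective and strictly better in at least one (minimization)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the nondominated members of a finite set of objective vectors."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (time, fuel) values for five candidate trajectories.
    candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (4, 4)]
    print(pareto_front(candidates))
```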

A  variety  of  survey  papers  review  the  state-of-the-art  in  EMO  research  from  the 
mid-1990s  to  the  present  [30,  31,  32],  and  a  comprehensive  website  with  over  1000  EMO 
references  provides  access  to  a  wealth  of  literature  [33].  Early  attempts  included  the 
Vector  Evaluated  Genetic  Algorithm  (VEGA)  [34],  which  has  a  tendency  to  find 
solutions  that  do  very  well  in  one  dimension,  but  not  “middling”  or  compromise 
candidates,  and  the  application  of  EA  heuristics  to  traditional  approaches  like  aggregation 
(combining  multiple  objectives  into  a  single  objective)  [35]  and  lexicographic  ordering 
(optimizing  the  objectives  in  turn,  beginning  with  the  most  important)  [36].  The  ideas  of 
Pareto  dominance  were  not  fully  utilized  at  this  time. 

The  primary  drawback  of  the  aggregation  approach,  so  common  in  traditional 
engineering  applications,  is  that  it  can  miss  areas  of  the  Pareto  optimal  front  if  the  front  is 
non-convex.  A  region  is  said  to  be  convex  if  it  always  contains  the  line  segment 
connecting  two  points  when  it  contains  the  two  points  themselves.  An  EA  that  scores 
members of a population based on their values for each objective $f_i$ individually, rather
than the aggregate of all of them, does not need to assume a convex Pareto front to
explore  all  of  it.  More  recent  research  has  concentrated  on  ways  to  encourage  population 
diversity,  so  that  the  EA  will  explore  the  entire  Pareto  front,  and  the  addition  of  elitism, 
which  usually  uses  an  external  file  to  store  nondominated  individuals  so  that  they  will  not 
be lost. (The ways in which the members of the external population interact with the rest
of the population vary from algorithm to algorithm.) The algorithms have been
determined  sufficiently  robust  to  be  applied  to  engineering,  industrial  and  scientific 
domains  [30].  The  bulk  of  this  work  has  been  done  for  problems  with  only  two  objective 
functions;  however,  some  research  has  been  done  with  problems  involving  three.  Beyond 
that,  the  field  re-names  the  problem  “many-objective  optimization,”  and  there  are  many 
open  questions  there.  In  particular,  it  has  been  shown  that,  as  the  number  of  objectives 
increases,  Pareto  dominance  becomes  nearly  useless  in  ranking  individuals  [37]. 
Furthermore,  it  is  recognized  that  often  the  user  does  not  really  want  to  have  to  evaluate 
all  members  of  the  Pareto  optimal  front,  and  that  incorporating  user  preference  into  the 
EAs  to  reduce  the  returned  solution  set  is  a  high  research  priority  [30]. 
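The drawback of aggregation on a non-convex front, noted at the start of this discussion, can be seen in a three-point example. The points below are hypothetical objective vectors for a two-objective minimization; all three are nondominated, but the middle one lies on a non-convex part of the front.

```python
def weighted_sum_winners(points, steps=101):
    """Sweep the weight w over [0, 1] and record which point minimizes
    w * f1 + (1 - w) * f2 for each weight."""
    winners = set()
    for k in range(steps):
        w = k / (steps - 1)
        winners.add(min(points, key=lambda p: w * p[0] + (1 - w) * p[1]))
    return winners


if __name__ == "__main__":
    # (0.6, 0.6) is nondominated but sits above the line joining the other two.
    candidates = [(0.0, 1.0), (0.6, 0.6), (1.0, 0.0)]
    print(sorted(weighted_sum_winners(candidates)))
```

For every weight, one of the two extreme points scores min(w, 1 - w), which is at most 0.5, while the compromise point always scores 0.6. No choice of weights therefore returns the compromise, even though a user might well prefer it.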

There  are  two  additional  considerations  to  note.  The  first  is  that  EAs  do  not 
explicitly  calculate  gradients  along  the  solution  set.  A  good  fitness  function  and  the 
judicious use of crossover and mutation ensure that the solutions will tend to follow the
gradient  down  to  the  minimum,  but  this  is  accomplished  by  selecting  ever  more-fit 
individuals,  not  by  taking  advantage  of  trends.  In  some  cases,  this  is  a  strength  -  EAs  are 
robust  to  discontinuities  in  the  solution  space,  including  discontinuities  occurring  at 
constraint  boundaries.  Since  they  do  not  calculate  a  gradient,  they  are  unaffected  if  it 
should disappear or go to infinity. But when gradients are available, their use would
greatly  speed  convergence. 

Second,  EAs  are  an  unconstrained  optimization  technique.  There  is  no  explicit 
mechanism  for  enforcing  constraints  such  as  Equations  2.2  and  2.3.  Researchers  treat  the 
problem  in  different  ways.  Sometimes,  constraints  can  be  recast  as  objectives.  An 
assortment  of  penalty  functions  can  be  invoked,  penalizing  the  fitness  of  solutions  that  do 
not  meet  the  constraints  -  although  this  can  run  the  risk  of  degenerating  into  a  random 
walk  if  there  are  no  good  solutions  at  all  in  the  initial  population  [32].  Another  recent 
approach  iteratively  shrinks  the  search  space  to  focus  on  zones  where  the  constraints  are 
met [38]. These approaches work, some faster than others, and most have a set of
parameters  (such  as  the  “rate  of  shrink”  in  [38])  that  must  be  set  and  then  tuned  by  the 
user.  If  the  parameters  are  poorly  set,  the  methods  do  not  work  well  at  all. 
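A minimal sketch of the penalty-function idea follows. It assumes constraints written in the form g(x) <= 0 (Equations 2.2 and 2.3 may be posed differently) and a quadratic penalty; the name `penalized_fitness` and the default `penalty_weight` are hypothetical, and the weight is exactly the kind of user-tuned parameter discussed above.

```python
def penalized_fitness(objective, constraints, x, penalty_weight=1e3):
    """Fitness to be minimized: objective plus a quadratic penalty on
    every violated constraint g(x) <= 0 (assumed constraint form)."""
    violation = sum(max(0.0, g(x)) ** 2 for g in constraints)
    return objective(x) + penalty_weight * violation


if __name__ == "__main__":
    # Hypothetical problem: minimize x^2 subject to x >= 1, i.e. 1 - x <= 0.
    objective = lambda x: x * x
    constraints = [lambda x: 1.0 - x]
    for candidate in (0.5, 1.0, 2.0):
        print(candidate, penalized_fitness(objective, constraints, candidate))
```

If every member of the initial population is infeasible, the penalty term dominates the fitness of all of them, which is one way to see the risk noted in [32].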

2.6  Isoperformance  and  Adaptive  Weighted  Sums 

de Weck et al. have investigated other, more deterministic methods of developing
a Pareto front. Isoperformance [39, 40] sets a required performance level, indicated by a
fixed value for a cost function. This can be the single output of a complex system, such
as the displacement of a space telescope subjected to disturbance forces. Through one of
several  algorithms,  the  design  variables  of  the  system  (which  for  the  hypothetical 
telescope  could  include  parameters  as  different  as  its  mass,  the  stiffness  of  its  bending 
modes,  the  bandwidth  of  its  attitude  control  system,  and  even  the  magnitude  of  the  star 
being  used  by  the  guidance  system  to  orient  the  system)  are  varied,  and  those 
combinations  which  give  the  desired  performance  level  are  recorded.  From  that  set,  a 
nondominated front is further selected. A user would then select a single solution from
among  the  nondominated  solutions. 

Adaptive  weighted  sum  methods  [41]  can  be  used  when  the  cost  function  is  of  the 
form  of  a  weighted  sum  of  terms.  Traditional  techniques  for  exploring  the  Pareto  front  of 
such  a  system  would  sample  weights  at  constant  fixed  intervals,  and  so  might  miss  many 
important  features  of  the  front.  The  adaptive  weighted  sum  approach  begins  with  a 
constant  interval  mesh  of  weights,  then  refines  the  mesh  in  areas  where  there  are  large 
gaps between returned cost values. Additional inequality constraints are also added to restrict
the  calculations  to  areas  where  the  Pareto  front  is  thought  to  lie. 
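The mesh-refinement step can be sketched on a toy problem. The code below is a simplification, not the method of [41]: it omits the additional inequality constraints, and it assumes a hypothetical two-objective problem, f1 = x^2 and f2 = (x - 2)^2, whose weighted-sum minimizer is known in closed form as x = 2(1 - w).

```python
import math


def solve_weighted(w):
    """Minimize w*f1 + (1-w)*f2 for f1 = x^2, f2 = (x-2)^2 (toy problem).
    Setting the derivative to zero gives x = 2*(1 - w)."""
    x = 2.0 * (1.0 - w)
    return (x ** 2, (x - 2.0) ** 2)


def adaptive_weights(initial=5, max_gap=0.4, max_rounds=10):
    """Start from a constant-interval mesh of weights, then bisect any
    interval whose two returned cost points are farther apart than max_gap."""
    weights = [i / (initial - 1) for i in range(initial)]
    for _ in range(max_rounds):
        points = [solve_weighted(w) for w in weights]
        new = [0.5 * (weights[i] + weights[i + 1])
               for i in range(len(weights) - 1)
               if math.dist(points[i], points[i + 1]) > max_gap]
        if not new:
            break
        weights = sorted(weights + new)
    return weights, [solve_weighted(w) for w in weights]


if __name__ == "__main__":
    ws, pts = adaptive_weights()
    print(len(ws), "weights after refinement")
```

The resulting mesh is finer where the front is stretched in cost space and coarser elsewhere, which is the behavior a fixed-interval sampling cannot provide.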

These  techniques  seem  very  promising  for  systems  design,  or  in  any  application 
where  a  user  would  want  to  obtain  an  entire  nondominated  set  of  solutions  for 
consideration.  If  we  were  interested  in  only  planning  a  route  between  two  fixed  points, 
the time required to form the Pareto front of trajectories using one of these techniques
might  be  worthwhile.  However,  our  interest  is  in  calculating  many  trajectories  in  similar 
but  not  identical  environments.  Since  the  trajectory  generation  process  is  itself 
computationally  intensive,  we  would  like  to  avoid  computing  a  Pareto  front  of  solutions 
for  any  one  set  of  boundary  conditions. 

2.7  Mixed  Integer  Linear  Programming 

Mixed  Integer  Linear  Programming  (MILP)  has  become  increasingly  popular  as  a 
relatively  fast  way  to  generate  and  optimize  trajectories  [42].  The  technique  has  been 
well-known  in  the  operations  research  field  for  many  years,  but  recent  increases  in 
computing  speed  have  allowed  the  technique  to  be  considered  for  real-time  planning 
applications. Equality and inequality constraints can all be handled robustly. It can
approximate non-convex constraints and can handle logical constraints as well. Obstacle
avoidance  is  done  by  placing  inequality  constraints  directly  on  the  path  space,  forcing  the 
trajectory  outside  the  region  of  the  obstacles;  penalty  functions  are  not  used.  The  cost 
functional  could  in  principle  include  many  terms,  although  applications  typically  focus  on 
either  fuel  or  time  optimal  trajectories.  MILP  has  been  applied  to  a  spacecraft 
rendezvous problem [43] and to multi-satellite reconfiguration problems [44].

However,  as  its  name  suggests,  it  is  suitable  only  for  problems  with  linear 
constraints, including dynamic constraints. [43] uses the linear Hill's/Clohessy-Wiltshire
equations  (although  this  is  not  inappropriate  for  a  docking  maneuver)  and  [44]  linearizes 
gravity-free  dynamics.  MILP  does  not  handle  nonlinear  constraints  at  all,  except  by 
linearizing  them.  This  is  appropriate  for  some  domains,  but  not  for  all.  Control  of  a 
satellite  formation  over  a  highly-elliptical  orbit,  for  example,  is  not  amenable  to 
linearization. 
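To make the obstacle-avoidance constraints concrete, one commonly used "big-M" encoding for a rectangular obstacle is sketched below. It is an illustration under assumptions, not necessarily the encoding used in [42]-[44]: the obstacle bounds, the large constant M, and the binary variables b are introduced here for the example. At each time step k, at least one of the four inequalities must be active, which places the point (x, y) outside the rectangle.

```latex
\begin{aligned}
 x(t_k) &\le x_{\min} + M\, b_{1,k} \\
-x(t_k) &\le -x_{\max} + M\, b_{2,k} \\
 y(t_k) &\le y_{\min} + M\, b_{3,k} \\
-y(t_k) &\le -y_{\max} + M\, b_{4,k} \\
\sum_{j=1}^{4} b_{j,k} &\le 3, \qquad b_{j,k} \in \{0, 1\}
\end{aligned}
```

Every constraint is linear in the continuous and binary variables, which is why the non-convex "stay outside" requirement fits the MILP framework while a nonlinear dynamic constraint does not.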

2.8  Optimal  Control 

Optimal  control  methods  [45,  46]  have  been  used  to  solve  trajectory  planning 
problems  for  many  years.  The  calculus  of  variations  is  used  to  frame  the  problem  as  a 
system  of  differential  equations  subject  to  conditions  imposed  at  the  initial  and  final  time. 
The  equations  can  be  -  and  often  are  -  nonlinear.  System  dynamics  as  well  as  other 
constraints  can  be  included.  Generally,  a  cost  functional  is  of  the  form: 

$$J = \int_{t_0}^{t_f} g(x(t), x'(t), t)\,dt \tag{2.4}$$

where x(t) is the state vector and x'(t) is its derivative. A variation in the functional, $\delta J$,
can be defined for small changes of g(x(t),x'(t),t). If a relative minimum for J exists, it is
necessary that $\delta J$ be zero at that point. Applying the definition of $\delta J$ to Equation 2.4
yields the Euler Equation:


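$$\frac{\partial g}{\partial x}\big(x^*(t), {x'}^*(t), t\big) - \frac{d}{dt}\left[\frac{\partial g}{\partial x'}\big(x^*(t), {x'}^*(t), t\big)\right] = 0 \tag{2.5}$$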


where  x*(t)  is  an  extremal  state  vector  and  x'*(t)  its  derivative. 
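As a brief worked example (the integrand is chosen here for illustration and is not one used elsewhere in this work), take $g = [x'(t)]^2$ with a scalar state. Substituting into Equation 2.5 gives a straight-line extremal:

```latex
\frac{\partial g}{\partial x} = 0, \qquad
\frac{\partial g}{\partial x'} = 2x'(t)
\quad\Longrightarrow\quad
0 - \frac{d}{dt}\left[2\,{x'}^*(t)\right] = 0
\quad\Longrightarrow\quad
{x''}^*(t) = 0, \qquad x^*(t) = c_1 t + c_2
```

The constants $c_1$ and $c_2$ are then fixed by the conditions imposed at the initial and final times.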

The  problem  is  to  find  an  admissible  input  (or  control)  vector  u*(t)  that  causes  a 
system  described  by  the  differential  equations  in  Equation  2.6  to  follow  an  admissible 
trajectory  x*(t)  that  minimizes  the  cost  functional  Equation  2.7. 

$$x'(t) = a(x(t), u(t), t) \tag{2.6}$$

$$J(u) = \int_{t_0}^{t_f} g(x(t), u(t), t)\,dt \tag{2.7}$$

At  all  points  along  an  admissible  trajectory,  Equation  2.6  holds  and  can  be 
rewritten: 


$$a(x(t), u(t), t) - x'(t) = 0 \tag{2.8}$$

and added to g(x(t),u(t),t) with Lagrange multipliers $\lambda$ to form an augmented cost
functional:

$$J_a(u) = \int_{t_0}^{t_f} g_a(x(t), x'(t), u(t), \lambda(t), t)\,dt$$

or, rewriting,

$$J_a(u) = \int_{t_0}^{t_f} \left\{ g(x(t), u(t), t) + \lambda^T(t)\left[a(x(t), u(t), t) - x'(t)\right] \right\} dt \tag{2.9}$$

The extremals of the functional are where $\delta J_a$ is zero. Finding $\delta J_a$ and setting it to
zero results in three necessary equations. They are most commonly expressed in terms of
the Hamiltonian, which is defined as:

$$H(x(t), u(t), \lambda(t), t) = g(x(t), u(t), t) + \lambda^T(t)\left[a(x(t), u(t), t)\right]$$


The  necessary  conditions  are  then: 


$${x'}^*(t) = \frac{\partial H}{\partial \lambda}\big(x^*(t), u^*(t), \lambda^*(t), t\big) \tag{2.11a}$$

$${\lambda'}^*(t) = -\frac{\partial H}{\partial x}\big(x^*(t), u^*(t), \lambda^*(t), t\big) \tag{2.11b}$$

$$0 = \frac{\partial H}{\partial u}\big(x^*(t), u^*(t), \lambda^*(t), t\big) \tag{2.11c}$$

for all $t \in [t_0, t_f]$. For a fixed final time and a fixed final state, we have boundary
conditions

$$x(t_0) = x_0 \tag{2.12a}$$

$$x(t_f) = x_f \tag{2.12b}$$

which give the equations needed to determine the constants of integration. Solving the
system  of  equations  returns  the  function  (trajectory)  that  will  minimize  the  cost 
functional. 
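A small worked example may help fix the procedure. The system and cost below are chosen here for illustration only: a scalar integrator $x'(t) = u(t)$ with $g = \tfrac{1}{2}u^2(t)$, fixed final time, and fixed final state.

```latex
\begin{aligned}
H &= \tfrac{1}{2}u^2(t) + \lambda(t)\,u(t) \\
{x'}^*(t) &= \frac{\partial H}{\partial \lambda} = u^*(t)
  && \text{(2.11a)} \\
{\lambda'}^*(t) &= -\frac{\partial H}{\partial x} = 0
  \;\Rightarrow\; \lambda^*(t) = c_1
  && \text{(2.11b)} \\
0 &= \frac{\partial H}{\partial u} = u^*(t) + \lambda^*(t)
  \;\Rightarrow\; u^*(t) = -c_1
  && \text{(2.11c)} \\
x^*(t) &= -c_1 t + c_2, \qquad
u^*(t) = \frac{x_f - x_0}{t_f - t_0}
  && \text{(from 2.12a, 2.12b)}
\end{aligned}
```

Here the two boundary conditions determine the two constants of integration, and the optimal control is a constant input that carries the state linearly from $x_0$ to $x_f$.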

If  the  final  time  and  final  state  are  not  fixed  but  are  free,  a  new  boundary 
condition  called  the  transversality  condition  is  produced: 

$$g_a(x(t_f), u(t_f), \lambda(t_f), t_f)\,\delta t_f - \lambda^T(t_f)\,\delta x_f = 0 \tag{2.13}$$

When $x_f$ is fixed, as in this work, $\delta x_f = 0$, so it must be that
$g_a(x(t_f), u^*(t_f), \lambda(t_f), t_f) = 0$.

Except  for  certain  special  cases,  there  is  no  way  to  analytically  solve  the  optimal 
control  problem.  A  variety  of  numeric  methods  have  been  employed,  including  the 
shooting  method  [47]  and  collocation.  The  shooting  method  is  so-called  because  it  uses 
initial  value  problems  (IVPs)  as  a  starting  point  to  “shoot”  towards  the  solution  of  the 
optimal  controls  boundary  value  problem  (BVP).  Unfortunately,  a  stable  BVP  (one 
insensitive  to  changes  in  boundary  values)  may  require  the  integration  of  unstable  IVPs 
(ones  highly  sensitive  to  changes  in  boundary  values).  This  drawback  led  to  the 
development  of  the  collocation  approach. 
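The shooting idea can be sketched on a small generic boundary value problem. The example below is an assumption-laden illustration, not the optimal control BVP of this work: it solves y'' = -y with y(0) = 0 and y(pi/2) = 1 by guessing the unknown initial slope, integrating the resulting IVP with a fixed-step fourth-order Runge-Kutta scheme, and correcting the guess with a secant iteration. The function names are hypothetical.

```python
import math


def integrate_ivp(slope, t_end=math.pi / 2, steps=200):
    """RK4 integration of y'' = -y from y(0) = 0, y'(0) = slope.
    Returns y(t_end)."""
    h = t_end / steps
    y, v = 0.0, slope

    def f(y, v):
        return v, -y  # (dy/dt, dv/dt)

    for _ in range(steps):
        k1y, k1v = f(y, v)
        k2y, k2v = f(y + 0.5 * h * k1y, v + 0.5 * h * k1v)
        k3y, k3v = f(y + 0.5 * h * k2y, v + 0.5 * h * k2v)
        k4y, k4v = f(y + h * k3y, v + h * k3v)
        y += h / 6.0 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v += h / 6.0 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return y


def shoot(target=1.0, s0=0.0, s1=2.0, tol=1e-10, max_iter=50):
    """Secant iteration on the initial slope until the IVP hits the
    far boundary value."""
    r0 = integrate_ivp(s0) - target
    r1 = integrate_ivp(s1) - target
    for _ in range(max_iter):
        if abs(r1) < tol:
            break
        s0, s1, r0 = s1, s1 - r1 * (s1 - s0) / (r1 - r0), r1
        r1 = integrate_ivp(s1) - target
    return s1


if __name__ == "__main__":
    print("initial slope:", shoot())  # analytic solution y = sin(t) has slope 1
```

This problem is well behaved; when the IVP is unstable, small changes in the guessed slope produce very large changes at the far boundary and the iteration becomes unreliable, which is the drawback noted above.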

In  collocation,  the  actual  solution  to  differential  equations  2.11  is  approximated 
over  a  mesh,  defined  by  “knot  points.”  The  approximation  is  made  to  satisfy  the 
boundary constraints at $t_0$ and $t_f$, and further to satisfy Equations 2.11a-c at each knot
point  and  at  the  midpoint  of  each  interval  between  them.  An  initial  guess  for  the  solution 
must  be  provided;  the  solution  technique  will  alter  the  current  solution  estimate  to  bring 
its  residual  (a  measure  of  error)  to  within  acceptable  bounds. 

There are many ways to solve a collocation problem. Solution methods fall in
general  into  two  classes:  direct  and  indirect.  Direct  methods  [48]  model  the  approximate 
solution  as  composed  of  basis  functions;  the  solution  is  improved  by  altering  a  vector 
containing  the  coefficients  for  the  basis  functions.  This  allows  vector  optimization 
algorithms such as sequential quadratic programming, Gauss-Newton, or Levenberg-Marquardt
to be applied [49]. Direct methods are considered faster and more robust than
the  indirect  methods.  Indirect  methods  (as  in,  e.g.,  [50])  link  the  knot  points  with 
continuous  approximating  functions  (e.g.,  cubic  splines)  over  each  subinterval.  The 
coefficients  of  each  of  these  functions  must  then  be  solved.  This  makes  indirect  methods 
generally  more  computationally  intensive  than  direct  methods.  Their  advantage  is 
additional  flexibility;  the  basis  functions  in  the  direct  methods  must  be  chosen  such  that 
every  function  could  be  a  feasible  trajectory.  The  indirect  method  has  no  such  constraint. 

The  optimal  control  problem’s  solution  is  governed  by  a  single  cost  functional. 
Multiple  objectives  can  only  be  optimized  via  an  aggregation  method.  Since  some 
constraints  are  likely  to  be  non-convex,  this  means  that  certain  solutions  along  the  Pareto 
optimal  front  may  be  missed.  Typically,  a  sufficient  number  of  other  solutions  that  also 
satisfy  the  user’s  preferences  also  exist  where  they  can  be  detected. 

Optimal control problems can incorporate constraints and discontinuities.
Equality constraints on the state (such as satisfying the system dynamics) are adjoined to
the cost functional via Lagrange multipliers, as discussed above. Constraints on the
control  inputs  can  be  handled  via  Pontryagin’s  Minimum  Principle  and  the  resulting 
switching curves. Inequality constraints can be handled by the introduction of a function
of a dummy variable, $x_{n+1}$, whose derivative is defined as:

$$x'_{n+1}(t) = [f_1(x(t),t)]^2\,\mathbf{1}(-f_1) + [f_2(x(t),t)]^2\,\mathbf{1}(-f_2) + \dots + [f_l(x(t),t)]^2\,\mathbf{1}(-f_l) \tag{2.14}$$

where $\mathbf{1}(-f_i)$ is a unit Heaviside function defined by:

$$\mathbf{1}(-f_i) = \begin{cases} 0, & f_i(x(t),t) \ge 0 \\ 1, & f_i(x(t),t) < 0 \end{cases} \tag{2.15}$$

for $i = 1, 2, \dots, l$ (where $l \le m$, the size of the control vector). $x_{n+1}$ can then be defined as:

$$x_{n+1}(t) = \int_{t_0}^{t} x'_{n+1}(\tau)\,d\tau + x_{n+1}(t_0) \tag{2.16}$$

We then require boundary conditions $x_{n+1}(t_0) = 0$ and $x_{n+1}(t_f) = 0$. Since the derivative is
never less than zero, $x_{n+1}(t)$ must be zero for all t. This is a constraint of the form
f(x(t),t) = 0, and it can be treated by the method of Lagrange multipliers.
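The behavior of the dummy variable can be checked numerically. The sketch below assumes constraints of the form f(x, t) >= 0, a trajectory sampled at discrete times, and trapezoidal integration; the function names are hypothetical. It evaluates Equations 2.14 to 2.16 and shows that the final value is zero only when no sample violates a constraint.

```python
def heaviside_neg(f):
    """Unit Heaviside 1(-f): 0 where f >= 0 (constraint met), 1 where f < 0."""
    return 0.0 if f >= 0.0 else 1.0


def dummy_rate(constraint_values):
    """Right-hand side of Equation 2.14 for one sample of the trajectory."""
    return sum(f * f * heaviside_neg(f) for f in constraint_values)


def dummy_final_value(times, states, constraints):
    """Trapezoidal approximation of Equation 2.16 with x_{n+1}(t0) = 0."""
    total = 0.0
    previous = None
    for t, x in zip(times, states):
        rate = dummy_rate([f(x, t) for f in constraints])
        if previous is not None:
            total += 0.5 * (rate + previous[1]) * (t - previous[0])
        previous = (t, rate)
    return total


if __name__ == "__main__":
    # Hypothetical constraint x <= 1, written as f(x, t) = 1 - x >= 0.
    constraints = [lambda x, t: 1.0 - x]
    print(dummy_final_value([0, 1, 2], [0.0, 0.5, 0.9], constraints))  # feasible
    print(dummy_final_value([0, 1, 2], [0.0, 1.5, 0.0], constraints))  # violates
```

Because the rate is zero everywhere on a feasible trajectory, the quantity carries no slope information inside the feasible region.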

However, the switching curves and Heaviside function are clearly discontinuous,
making them problematic for many numeric solvers. They can be approximated by a
series of increasingly steep polynomials to avoid this. However, the unchanging nature
of $x_{n+1}$ presents a further problem. Collocation solvers require gradient information to
reduce the error between the current approximate solution and the true solution. Since
$x_{n+1}$ is identically zero for the entire trajectory, it provides no gradient information at all.
In  our  particular  case,  the  collocation  solver  required  a  Jacobian  matrix  which,  when  the 
Heaviside  approximation  was  added,  contained  a  full  row  of  zeroes  and  so  would  not 
solve.  Adjusting  the  Heaviside  approximation  further  to  provide  some  kind  of  gradient 
information  in  the  allowable  solution  region  amounted  to  instituting  a  penalty  function, 
which  is  another  way  to  treat  state  inequality  constraints. 

Penalty functions are often used in path and trajectory planning for obstacle
avoidance.  Often  cubic  in  form,  these  penalty  functions  are  centered  over  an  obstacle  and 
monotonically  decrease  as  they  move  away  from  its  center.  Typically,  they  go  to  zero  at 
some  influence  limit  away  from  the  obstacle,  but  this  is  not  required.  Figure  2  shows  the 
piecewise  continuous  penalty  function  used  for  obstacle  avoidance  in  the  2DOF  work  in 
Chapter  4.  It  has  a  fixed  value  at  the  object’s  center,  at  the  edge  of  the  object,  and  at  a 
fixed distance from the edge of the object. The coefficients of the cubic functions used
can be varied to achieve these conditions for obstacles of different sizes.

Figure 2: A potential function

The  segment  of  the  function  that  extends  from  the  edge  of  the  obstacle  to  the  end 
of  the  “influence  limit”  some  distance  away  provides  uniform  penalties  for  approaching 
obstacles  of  any  size.  No  matter  how  large  or  small  the  obstacle,  this  segment  of  the 
function  persists  from  its  outer  edge  (where  contact  would  occur)  to  the  influence  limit. 
The segment which lies over the obstacle itself aids in computation. If the penalty over
the obstacle were flat, and a possible path were plotted through the obstacle, there would be
no information available to the solver to make a decision about moving the path. It
would  be  clear  that  this  was  an  expensive  path,  but  perturbations  around  it  would  cost 
effectively  the  same.  With  this  second  segment,  the  solver  can  quickly  determine  a 
gradient  that  will  “roll”  it  off  of  the  obstacle. 
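A two-segment cubic penalty of this general kind can be sketched as follows. The exact coefficients behind Figure 2 are not reproduced here; the shape below is an assumption: one cubic falling from `p_center` to `p_edge` across the obstacle, and a second cubic falling from `p_edge` to zero at the influence limit. All parameter names and default values are hypothetical.

```python
def obstacle_penalty(r, radius, influence, p_center=10.0, p_edge=5.0):
    """Penalty as a function of distance r from the obstacle center.

    Segment 1 (0 <= r <= radius): cubic from p_center down to p_edge, so a
    path inside the obstacle still sees a slope pushing it outward.
    Segment 2 (radius < r < radius + influence): cubic from p_edge down to
    zero; its width is the same for any obstacle size.
    """
    if r < 0.0:
        raise ValueError("distance must be non-negative")
    if r <= radius:
        return p_center - (p_center - p_edge) * (r / radius) ** 3
    if r < radius + influence:
        return p_edge * (1.0 - (r - radius) / influence) ** 3
    return 0.0


if __name__ == "__main__":
    for r in (0.0, 1.0, 2.0, 3.5, 5.0, 6.0):
        print(r, obstacle_penalty(r, radius=2.0, influence=3.0))
```

Scaling only the first segment with `radius` keeps the fixed values at the center, the edge, and the influence limit for obstacles of different sizes.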

The  value  of  the  penalty  function  is  added  to  the  cost  functional.  As  cost  is 
minimized,  the  trajectory  will  tend  to  be  moved  away  from  the  obstacles.  However,  if 
other  costs  are  sufficiently  great,  it  may  be  numerically  less  expensive  to  accept  the 
penalty  -  which  means  planning  a  path  through  the  obstacle.  Penalty  functions  do  not 
offer  guarantees  on  constraint  satisfaction,  which  means  that  solutions  generated  via 
optimal  control  methods  must  be  checked  in  a  post-processing  step. 

Using  penalty  functions  to  disallow  regions  of  the  velocity  space  was  attempted  as 
part  of  this  research.  However,  doing  so  caused  an  unacceptable  slow-down  of  trajectory 
planner  performance.  The  planners  used  in  general  have  a  more  difficult  time, 
computationally,  when  there  are  many  obstacles  near  the  generated  trajectory,  and  the 
obstacles in velocity-space were apparently too near the solution velocity. This may not
be a problem with another trajectory planner, in which case penalty functions would be a valid technique for
enforcing these limits. However, as with penalty functions used for physical obstacles,
there  is  always  a  chance  that  there  will  be  a  trajectory  whose  minimum  cost  lies  within 
the  obstacle,  if  the  penalty  function  is  not  sufficiently  large.  Trajectories  would  still  have 
to  be  checked  for  these  failures,  and  some  parameters  of  the  penalty  function  adjusted  to 
shift  the  trajectory  to  the  outside  of  the  obstacle. 

2.9 Fuzzy Set Theory


The  main  idea  behind  fuzzy  set  theory  is  that  a  member  of  a  set  may  belong  only 
partly  to  that  set  [2].  Classically,  individuals  either  are  or  are  not  contained  in  a  set;  they 
are  either  hot  or  not  hot,  for  example.  In  fuzzy  set  theory,  an  individual  may  be  50%  hot 
and 50% not hot, or 30% hot and 50% warm. Complements, like “hot” and “not hot,”
must sum to 100%, but non-complementary attributes need not. For example, the vertical
line  in  Figure  3  indicates  that  generic  feature  value  F  is  about  45%  “low,”  about  60% 
“medium,”  and  0%  “high.” 


Figure 3: A fuzzy membership function

The  triangles  in  Figure  3  are  membership  functions.  They  correlate  “crisp” 
numeric values, as measured in the real world, to these fuzzy levels. A fuzzy rule set
then acts on these “fuzzified” inputs. For example, “If air temperature is LOW, turn
heater fan to HIGH” and “If air temperature is MEDIUM, turn heater fan to LOW.” The
fan speeds will have similar fuzzy membership functions correlating speeds like “HIGH”
and “LOW” to revolutions per minute. These outputs are scaled by the membership
function of the inputs.
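The heater example can be sketched in a few lines. Everything numeric below is hypothetical (the temperature ranges, the representative fan speeds, and the function names), and the output step is a simplified weighted average rather than a full defuzzification over output membership functions.

```python
def triangle(x, left, peak, right):
    """Triangular membership: 0 at left and right, 1 at peak.
    Assumes left < peak < right."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical air-temperature levels in degrees Celsius.
TEMPERATURE = {
    "LOW": (0.0, 10.0, 20.0),
    "MEDIUM": (10.0, 20.0, 30.0),
    "HIGH": (20.0, 30.0, 40.0),
}

# Rules from the text: temperature level -> representative fan speed (rpm).
# The rpm values are placeholders standing in for the peaks of the fan's
# own "HIGH" and "LOW" membership functions.
RULES = {"LOW": 3000.0, "MEDIUM": 1000.0}


def fan_speed(temperature):
    """Scale each rule's output by the membership of its input and average."""
    fired = {level: triangle(temperature, *TEMPERATURE[level]) for level in RULES}
    total = sum(fired.values())
    if total == 0.0:
        return 0.0  # no rule applies
    return sum(fired[level] * RULES[level] for level in RULES) / total


if __name__ == "__main__":
    print(fan_speed(14.0))
```

At 14 degrees the input is partly LOW and partly MEDIUM, so the commanded speed falls between the two rule outputs in proportion to those memberships.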

Natural language, while a desirable input modality, is inherently ambiguous.
From interpreting sounds into words to parsing the words into sentences to interpreting
the possible shades of meaning of a sentence, there are ambiguities that must be dealt
with. Classical mathematics does not manage ambiguities well. Fuzzy techniques, on the
other hand, deal with them substantially better [3]. Ambiguities lend themselves to
problems  involving  shades  of  meaning  or  even  slight  differences  in  pronunciation. 

Fuzzy  optimization,  a  fairly  new  field,  applies  fuzzy  set  theory  to  optimization 
problems.  Fuzzy  techniques  are  not  themselves  used  to  solve  the  problem,  but  are  rather 
applied  to  candidate  solutions  to  rank  them.  They  are  often  used  in  conjunction  with 
EAs,  where  the  evolutionary  algorithms  generate  the  candidate  solutions  and  the  fuzzy 
methods are used to rank them before selection and breeding occur. Recent work [6]
has  shown  that  an  expanded  and  fuzzified  notion  of  Pareto  dominance  seems  to  perform 
more  in  accord  with  common  sense  than  strict  Pareto  dominance,  and  should  not  have  the 
same  problem  as  Pareto  dominance  (e.g.,  that  all  solutions  become  equally  good)  as  the 
number  of  objective  functions  increases  to  infinity. 

3 Architecture


The  problem  with  which  we  are  faced  is  this:  to  generate  a  dynamically  feasible 
trajectory for some autonomous vehicle which also satisfies user-imposed constraints.
Some  of  these  constraints  are  “soft,”  and  indicate  a  user’s  preference  for  a  solution  in 
some  region  of  the  state  space.  Other  constraints  are  “hard”  and  indicate  regions  of  the 
state  space  from  which  the  vehicle  is  forbidden.  Under  these  constraints,  we  still  wish  to 
optimize  the  trajectory  with  respect  to  fuel  and/or  time.  We  will  use  a  vector  of  weights 
to  direct  the  solution  to  the  preferred  regions  of  state  space,  automatically  adjusting  the 
vector  if  an  acceptable  solution  is  not  found. 

Figure  4  shows  an  outline  of  the  agent’s  processes.  At  the  center  sits  the 
evaluation module, overseeing all activities. The human user interacts with this module,
monitoring  events  rather  than  directly  participating  in  trajectory  generation  processes. 
The  evaluation  module,  EVAL,  accepts  a  planning  problem,  P0,  which  can  be  posed  by 
the  user  or  by  any  suitable  high-level  planner  that  builds  task-level  actions  to  achieve  its 
goals,  some  of  which  may  require  vehicle  motions. 

Figure 4: System architecture

A trajectory planning problem $P^0$ is defined as $\{D, \{O\}, H^0, S^0, bc\}$. Domain D
describes system dynamics and the parameterized cost functional J to be minimized. {O}
represents the set of obstacles in the environment. $H^0$ describes the hard constraints
(limits on state space values) to be met, whereas $S^0$ is a set of soft constraints that indicate
user preference, but are ultimately flexible. $H^0$ are numeric; $S^0$ may be numeric or fuzzy
linguistic terms. Fuzzy terms must eventually be converted into soft numeric limits; L,
the set of all limits, includes $H^0$ and the extended $S^0$. Members of L may be upper limits,
lower limits, or range limits (when we want the state feature to be within an upper and a
lower limit). The boundary conditions $bc = \{t_0, x_0, x_f\}$ are split and can include all of the
usual optimal control cases (e.g., fixed or free final time or state, final state constrained
to a fixed or moving surface). The goal is to return a feasible and optimal solution
$X = \{J^n, L^n, t^n, x^n, u^n\}$, where $J^n$ and $L^n$ summarize solution cost and the feature limits/constraints,
respectively, of the nth iteration and the set $\{t^n, x^n, u^n\}$ specifies the full-state trajectory
(i.e., time sequence $t^n$, position/velocity vector sequence $x^n$, control inputs $u^n$) to be
executed. This goal is achieved through intelligent selection of a trajectory planning
function and selection and adjustment of a weight vector $Q^i$ that influences the relative
importance of terms in the cost functional J. EVAL incrementally builds a history of
activities $\{HIST\} = \{HIST^1, HIST^2, \dots\}$ with $HIST^i$ including a record of the function
used by TPLAN, the initial solution estimate, and the weight vector $Q^i$ used for the ith
iteration. EVAL can then use {HIST} to identify which weight adjustment strategies it
has already employed, to avoid infinite loops. Figure 5 shows the possible paths through
the architecture.

Figure 5: Paths through the architecture
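One way to picture the data involved is as a handful of plain records. The sketch below is illustrative only: the class and field names are hypothetical, the domain and initial guess are reduced to labels, and the cycle check is a simple tolerance comparison of weight vectors against the recorded history.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass
class Limit:
    """An upper, lower, or range limit on one trajectory feature."""
    feature: str
    lower: Optional[float] = None
    upper: Optional[float] = None

    def met(self, value: float) -> bool:
        return ((self.lower is None or value >= self.lower)
                and (self.upper is None or value <= self.upper))


@dataclass
class PlanningProblem:
    """Stand-in for the problem tuple: domain, obstacles, hard and soft
    limits, and boundary conditions."""
    domain: str
    obstacles: List[Tuple[float, float, float]]  # assumed (x, y, radius)
    hard: List[Limit]
    soft: List[Limit]
    t0: float
    x0: Sequence[float]
    xf: Sequence[float]


@dataclass
class HistoryEntry:
    """One iteration's record: planner used, initial guess, weight vector."""
    planner: str
    initial_guess: str
    weights: Tuple[float, ...]


@dataclass
class History:
    entries: List[HistoryEntry] = field(default_factory=list)

    def record(self, entry: HistoryEntry) -> None:
        self.entries.append(entry)

    def has_cycle(self, weights: Sequence[float], tol: float = 1e-9) -> bool:
        """True if this weight vector was already tried."""
        return any(
            len(e.weights) == len(weights)
            and all(abs(a - b) <= tol for a, b in zip(e.weights, weights))
            for e in self.entries
        )
```

Recording the weight vector with each iteration is what lets the evaluation step notice that an adjustment has led back to weights it has already used.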


INIT initializes the problem state, $P^0$, selecting a trajectory planning function and
providing a vector of cost functional weights $Q^1$ and an initial guess for the trajectory $X^0$,
if needed. Next, TPLAN generates an optimal trajectory based on this initial $P^0$. FEXT
extracts the relevant trajectory features, $F^1$, and returns them to EVAL. If all elements of
$F^1$ are within the limits $L^0$, the trajectory is deemed a satisficing solution and the solution
X is returned by RETURN to the user and the higher-level strategic planner. Otherwise,
the nature of the limit failures is analyzed. If the user-imposed hard limits (i.e., those
which must be satisfied) are not met, WADJ is called to determine new weights $Q^{i+1}$.
EVAL then checks {HIST} for a cycle in the weights; if one has occurred, the process
fails. This loop continues until the hard limits are met. If the hard limits are met but the
user-imposed soft limits (i.e., those which we prefer to be satisfied) fail, the process
enters a second loop. The goal in this case is to improve the solution so that it meets all
of the limits, hard and soft, but to provide some solution within a limited time. The user
may adjust this time limit to reflect his desire for constraint satisfaction versus his desire
for timely results. (The loop which attempts to meet $H^0$ does not have a similar
time limit because we assume that the user has no use for the illegal trajectory; that loop
will run until $H^0$ are met, the process fails, or the user interrupts it.) In any case, the
trajectory which satisfied the hard limits but not the soft limits is stored, so that a legal
trajectory is guaranteed to be returned. The process continues as for the hard limit case,
with the exception that if a trajectory is found which again satisfies the hard limits but not
the soft ones, this solution is compared to the prior best_trajectory. If it is a “better”
solution than best_trajectory, it replaces best_trajectory in memory. Currently, the
2-norms of the respective error margin vectors, margin_error, are compared to determine
which solution is “better,” and the solution with the smaller overall error margin norm is
kept. The error margins are defined as:


$$\mathit{margin\_error}^i_j = \begin{cases} F^i_j - L^0_j, & F^i_j \not\circ L^0_j \\ 0, & F^i_j \circ L^0_j \end{cases} \tag{3.1}$$

where the operator $\circ$ indicates that a feature meets its corresponding limit, whether it
is below an upper bound, above a lower bound, or within a range. For the ith iteration
of the trajectory planner, the jth feature is compared to the jth element of the vector of
limits. The elements of margin_error were not normalized for limit size in its first version
and an upgrade was never made; certainly some sort of normalization would be more
appropriate. An element of margin_error is negative when a lower limit is violated and
positive when an upper limit is violated. (If $L_j$ is a range limit, the end of the range that
is violated is used.) When the 2-norm is taken, all the values are of course positive.
However, the sign of $\mathit{margin\_error}_j$ is important because it is used by WADJ to determine
the direction of the weight change.
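The two loops and the error margin comparison can be sketched together. This is a simplified reading of the process, not the actual EVAL implementation: `plan`, `extract`, and `adjust` are hypothetical callables standing in for TPLAN, FEXT, and WADJ, the soft-loop time limit is replaced by an iteration count, and limits are given as (lower, upper) pairs with `None` for a missing bound.

```python
import math


def margin_error(value, lower=None, upper=None):
    """Signed violation of one limit: negative below a lower bound,
    positive above an upper bound, zero when the limit is met."""
    if lower is not None and value < lower:
        return value - lower
    if upper is not None and value > upper:
        return value - upper
    return 0.0


def margin_norm(features, limits):
    """2-norm of the margin_error vector; limits maps feature -> (lower, upper)."""
    return math.sqrt(sum(margin_error(features[name], lo, hi) ** 2
                         for name, (lo, hi) in limits.items()))


def eval_loop(plan, extract, adjust, weights, hard, soft, max_soft_iterations=10):
    """Iterate until all limits are met, a weight cycle occurs, or the
    soft-limit budget is used up. Returns the best legal trajectory."""
    history = []
    best, best_norm = None, math.inf
    soft_iterations = 0
    while True:
        if weights in history:  # cycle in the weights
            if best is None:
                raise RuntimeError("weight cycle before the hard limits were met")
            return best
        history.append(weights)
        trajectory = plan(weights)
        features = extract(trajectory)
        hard_ok = margin_norm(features, hard) == 0.0
        soft_norm = margin_norm(features, soft)
        if hard_ok and soft_norm == 0.0:
            return trajectory  # satisficing solution
        if hard_ok and soft_norm < best_norm:
            best, best_norm = trajectory, soft_norm  # new best legal trajectory
        if best is not None:
            soft_iterations += 1
            if soft_iterations >= max_soft_iterations:
                return best
        weights = adjust(weights, features)
```

Storing the first trajectory that meets the hard limits before continuing is what guarantees a legal result even when the soft limits are never met within the budget.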

We  now  examine  each  of  the  architecture  components  in  more  detail. 

3.1  Initialization 

To  guide  the  search  toward  an  acceptable  solution,  the  initialization  routine  INIT 
(Figure 6) translates knowledge about the domain D, constraints $L^0$ and obstacles {O}
into choices for the trajectory generation routine TPLAN, any seed information
required by TPLAN, and an initial weight vector $Q^1$.

Figure 6: Initialization procedure


Almost any trajectory generation tool set that optimizes over a weighted cost
function can be incorporated into the architecture. With multiple tools in place,
information for choosing between them must be made available. Figure 6 illustrates
choices between a mixed integer linear programming module (MILP) [26], a receding
horizon planner (RHP) [49], and MATLAB’s collocation-based boundary value solver
BVP4C [50]. User-provided information as well as domain information guides the
choices, although making a choice among multiple solvers is beyond the scope of this
work, which relies strictly on collocation, the strategy we consider more flexible than
MILP given nonlinear dynamics and more mature than receding horizon algorithms.

Depending on the choice of trajectory planner TPLAN, some initial guess may
need to be supplied to the trajectory generator (e.g., for collocation). We use a cubic
spline which satisfies boundary conditions $x_0$ and $x_f$. In the future, it may be desirable to
use a Rapidly-exploring Random Tree (RRT) [23] if the area in which the robot moves
is very cluttered with obstacles. The RRT can find a dynamically plausible (but non-optimal)
trajectory through the space. This solution can then be used as an initial guess
for one of the optimization routines.
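A cubic initial guess of this kind can be sketched for a single axis. The code below assumes that the boundary states consist of a position and a velocity and uses a single cubic Hermite segment between them; the function name and sampling scheme are hypothetical, and the actual implementation may construct its spline differently.

```python
def cubic_guess(t0, tf, p0, v0, pf, vf, samples=11):
    """Sample a cubic that matches position and velocity at t0 and tf.
    Returns a list of (time, position, velocity) tuples."""
    duration = tf - t0
    guess = []
    for k in range(samples):
        s = k / (samples - 1)  # normalized time in [0, 1]
        # Cubic Hermite basis functions.
        h00 = 2 * s**3 - 3 * s**2 + 1
        h10 = s**3 - 2 * s**2 + s
        h01 = -2 * s**3 + 3 * s**2
        h11 = s**3 - s**2
        position = h00 * p0 + h10 * duration * v0 + h01 * pf + h11 * duration * vf
        # Derivatives of the basis functions with respect to s.
        d00 = 6 * s**2 - 6 * s
        d10 = 3 * s**2 - 4 * s + 1
        d01 = -6 * s**2 + 6 * s
        d11 = 3 * s**2 - 2 * s
        velocity = (d00 * p0 + d01 * pf) / duration + d10 * v0 + d11 * vf
        guess.append((t0 + s * duration, position, velocity))
    return guess


if __name__ == "__main__":
    for sample in cubic_guess(0.0, 5.0, p0=0.0, v0=0.0, pf=10.0, vf=0.0):
        print(sample)
```

Such a guess ignores obstacles and dynamics entirely; it only gives the solver a smooth starting estimate that already satisfies the boundary conditions.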

Once TPLAN and any inputs it requires are chosen, the limits $L^0$ are used to
compute the initial weight vector $Q^1$ and an extended version of the limits themselves as
described  below.  When  no  hard  or  soft  constraints  are  specified,  INIT  defaults  to  equal 
weights  (Li' default)  for  all  terms  of  the  cost  functional. 

This computation of $Q^1$, shown in Figure 7, accepts limits in either the form of
qualitative  adverbs  (e.g.,  “quickly”  or  “safely”)  with  optional  adverbial  modifiers  (e.g., 
“very”)  or  numeric  constraints  on  a  trajectory  feature  (e.g.,  “maximum  speed  <  5  m/s”). 
The  numeric  constraints  may  be  either  hard  or  soft. 

Figure 7: Calculating $Q^1$ and extended $S^0$


We deal first with the adverb constraints, all of which are considered soft
constraints. We define each adverb that our system understands in terms of the trajectory
features we can extract. For example, “quickly” involves maximum forward speed (max-speed)
and average forward speed (avg-speed). We further define fuzzy levels for each
feature.  “Quickly”  involves  “high”  max-speed  and  “high”  avg-speed.  These  definitions 
are  stored  in  the  fuzzy  language  database  V. 

The  ranges  for  these  values  are  defined  based  on  vehicle  performance  constraints 
and  a  typical  definition  of  each  adverb  relative  to  these  constraints.  We  used  the  same 
data  that  generated  the  weight  adjustment  curves  to  correlate  feature  levels  to  weight 
levels (see “Weight Adjustment,” below), resulting in our fuzzy rule set Z. We cannot
use  the  weight  adjustment  equations  themselves,  as  they  require  constants  that  cannot  be 
calculated  until  the  first  set  of  trajectory  data  is  available.  But  we  can  say  that  “low” 
values  of  a  given  weight  produce  “high”  values  of  avg-speed.  We  define  ranges  for  the 
weight  levels  as  well.  Since  all  of  our  fuzzy  levels  are  defined  geometrically  as  triangles 
with  fixed  endpoints,  their  centroids  are  fixed  along  the  weight  axis.  Defuzzification  for 
each  weight  correlated  with  each  feature  is  the  process  of  looking  up  this  centroid  value. 
This  numeric  infonnation  is  also  contained  within  the  fuzzy  rule  set  Z  and  is  represented 
by  the  generic  fuzzy  membership  function  depicted  in  the  upper  right  of  Figure  7. 
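
The lookup-based defuzzification described above can be illustrated with a minimal Python sketch. The triangle endpoints, the weight names, and the contents of V and Z shown here are hypothetical placeholders chosen for illustration; they are not the values used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triangle:
    """Triangular fuzzy level with fixed endpoints on one axis."""
    left: float
    peak: float
    right: float

    def centroid(self) -> float:
        # The centroid abscissa of a triangle is the mean of the
        # abscissas of its three vertices.
        return (self.left + self.peak + self.right) / 3.0


# Hypothetical fuzzy levels on a weight axis spanning roughly 0.1 to 10.
WEIGHT_LEVELS = {
    "low": Triangle(0.1, 1.0, 3.4),
    "medium": Triangle(1.0, 4.0, 7.0),
    "high": Triangle(4.0, 8.0, 10.0),
}

# V: adverb -> required fuzzy level of each feature (entries from the text).
V = {"quickly": {"max-speed": "high", "avg-speed": "high"}}

# Z: (feature, feature level) -> (weight name, weight level).
# The pairing of features with particular weights is an assumption.
Z = {
    ("max-speed", "high"): ("W2", "high"),
    ("avg-speed", "high"): ("W1", "low"),
}


def initial_weight_estimates(adverb):
    """Return {feature: (weight name, defuzzified weight value)}."""
    estimates = {}
    for feature, level in V[adverb].items():
        weight_name, weight_level = Z[(feature, level)]
        # Defuzzification is a lookup of the fixed centroid.
        estimates[feature] = (weight_name, WEIGHT_LEVELS[weight_level].centroid())
    return estimates


estimates = initial_weight_estimates("quickly")
```

Because the triangles are fixed, no trajectory data are needed at this stage, which matches the constraint that INIT runs before any trajectory has been generated.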

Any numeric soft constraints in S° are then also fuzzified (so that
“4 m/s < max-speed < 5 m/s” becomes “max-speed high”) and the appropriate fuzzy weight
levels are found by referencing Z. (Our current implementation treats only soft numeric
range constraints. Soft upper and lower constraints would be treated like hard
constraints; see below.) We do this because, at this point in the algorithm, we lack
quantitative equations that could correlate hard or soft numeric constraints directly to
weight values. We have a general understanding that “max-speed high” requires “time
weight high,” and fuzzy estimates of what values constitute “high” for both parameters.
But we as yet lack a predictive equation that would accept as input a desired feature
value and give as output an estimated weight. After our first trajectory has been
computed, we can solve for certain parameters that do allow such estimation, but here in
INIT we do not have the required data.
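
As a rough illustration of fuzzifying a numeric range constraint, the sketch below assigns a range to the fuzzy level with the greatest membership at the range midpoint. Both the midpoint rule and the speed-level triangles are assumptions made for this example; the text does not specify how the level is selected.

```python
def tri_membership(x, left, peak, right):
    """Membership of x in a triangular fuzzy level (0 outside the support)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical max-speed levels in m/s, as (left, peak, right) triples.
SPEED_LEVELS = {
    "low": (0.0, 1.0, 2.5),
    "medium": (1.5, 3.0, 4.5),
    "high": (3.5, 5.0, 6.5),
}


def fuzzify_range(lower, upper, levels):
    """Map a numeric range constraint to the name of one fuzzy level."""
    midpoint = 0.5 * (lower + upper)
    return max(levels, key=lambda name: tri_membership(midpoint, *levels[name]))


# "4 m/s < max-speed < 5 m/s" maps to the "high" level with these triangles.
level = fuzzify_range(4.0, 5.0, SPEED_LEVELS)
```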

Taken all together, the constraints in S° represent the user’s preferences for the
vehicle’s behavior. It now remains to combine these preferences into a single weight
vector, and then to determine if the result is likely to satisfy any hard constraints H°
(the user’s requirements on the vehicle’s behavior) that have been given.

To  combine  all  of  these  weights,  we  find  a  centroid  that  represents  an  average 
over  each  feature’s  “ideal”  weight  vector.  The  “mass”  used  for  this  centroid  computation 
is  an  optional  adverbial  qualifier  that  can  be  stated  in  the  adverb  constraint.  So  “safely 
but  a  little  quickly,”  where  “a  little”  is  the  adverbial  qualifier,  will  place  more  emphasis 
on  weights  resulting  from  features  related  to  “safely”  than  weights  computed  from  the 
“quickly”  term. 
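
The centroid combination above amounts to a mass-weighted average of the per-term weight vectors. A minimal sketch follows; the numeric masses assigned to the qualifiers and the example weight vectors are hypothetical.

```python
# Hypothetical masses for adverbial qualifiers; None means no qualifier.
QUALIFIER_MASS = {None: 1.0, "a little": 0.5, "very": 2.0}


def combine_weight_vectors(ideal_vectors, masses):
    """Mass-weighted centroid of several equal-length weight vectors."""
    total_mass = sum(masses)
    n_terms = len(ideal_vectors[0])
    return [
        sum(m * v[k] for v, m in zip(ideal_vectors, masses)) / total_mass
        for k in range(n_terms)
    ]


# "safely but a little quickly": the "safely" vector gets full mass and
# the "quickly" vector gets the reduced "a little" mass.
safely = [8.0, 1.0, 4.0]    # illustrative ideal weights only
quickly = [2.0, 4.0, 1.0]   # illustrative ideal weights only
q1 = combine_weight_vectors(
    [safely, quickly],
    [QUALIFIER_MASS[None], QUALIFIER_MASS["a little"]],
)
```

With these numbers the combined vector lies twice as close to the “safely” vector as to the “quickly” vector, which is the intended effect of the qualifier.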

Now  we  have  our  initial  weight  vector  Q1 .  If  there  are  additional  hard  numeric 
constraints  on  features  in  H°,  we  fuzzify  those  constraints  as  we  did  for  the  numeric  soft 
range  constraints.  Then  we  check  if  the  current  weight  vector,  when  fuzzified  via  the 
rules  in  Z,  correlates  to  that  fuzzy  level.  That  is,  if  we  require  a  hard  limit  “ max-speed  < 
high”  and  the  relevant  weight  in  Q1  is  “low,”  are  we  likely  to  generate  an  acceptable 
behavior?  If  we  are  not,  the  weight  value  in  the  required  fuzzy  range  that  is  closest  to  the 
current  weight  value  is  selected.  So,  to  continue  the  example,  if  we  require  a  “high” 
weight  to  elicit  “high”  max-speed,  but  our  current  weight  is  “low,”  the  algorithm  will 
select  the  value  in  the  “high”  fuzzy  weight  triangle  closest  to  the  “low”  triangle.  Since 
this  overrides  the  centroid  calculation  that  was  built  on  user  preference,  it  is  done  only  for 
hard  limits  H°,  which  we  assume  the  user  needs  rather  than  wants. 
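
One way to read “the value in the required fuzzy range that is closest to the current weight value” is as a clamp of the current weight onto the support of the required triangle. The sketch below takes that reading; the triangle endpoints are hypothetical, and a practical implementation might step slightly inside the support, since membership is zero exactly at its endpoints.

```python
# Hypothetical weight levels as (left, peak, right) triples.
WEIGHT_LEVELS = {
    "low": (0.1, 1.0, 3.4),
    "high": (4.0, 8.0, 10.0),
}


def enforce_hard_limit(current_weight, required_level, levels):
    """Move the weight into the required level's support if it lies outside."""
    left, _peak, right = levels[required_level]
    # Clamping returns the point of [left, right] nearest to current_weight.
    return min(max(current_weight, left), right)


# A "low" weight of 1.5 is moved to the nearest edge of the "high" triangle.
adjusted = enforce_hard_limit(1.5, "high", WEIGHT_LEVELS)
```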

If this work is extended to include soft upper or lower numeric constraints, a
similar checking procedure would be used to see if they would probably be satisfied by
Q1 before the hard constraint check is performed.

3.2 Trajectory Generation

The TPLAN module takes the weight vector Q1 calculated by INIT as well as any
initial estimates it requires for itself that INIT generates. It returns a full state trajectory,
including  position,  velocity,  and  control  inputs  at  each  time  step.  The  only  requirement 
we  place  on  how  TPLAN  does  this  is  that  it  must  accept  some  vector  of  adjustable 
weights  which  we  can  manipulate  to  change  the  features  of  the  trajectory.  We  do  not  even 
require  that  TPLAN  return  optimized  trajectories,  although  we  have  chosen  to  use  such  a 
TPLAN.  The  computational  process  of  finding  the  optimal  trajectory,  given  user 
preferences,  was  not  the  primary  focus  of  this  research.  Users  can  incorporate  their 
preferred  trajectory  generator  ( TPLAN)  into  this  architecture,  so  long  as  it  meets  our 
requirements. 

The  current  implementation  assumes  a  cost  functional  in  the  form  of  a  weighted 
sum  of  terms.  If  the  trajectory  generation  process  were  to  use  substantially  different 
adjustable  parameters,  new  strategies  for  WADJ  would  have  to  be  developed.  For 
example,  an  evolutionary  algorithm  which  used  a  weighted  sum  as  its  fitness  function 
could  be  used  as  a  TPLAN,  since  by  changing  the  weights  we  can  change  what  is  a  “fit” 
trajectory  and  hence  what  the  EA  will  return.  But  a  multi-objective  EA  that  uses  Pareto 
optimality  as  a  fitness  function  would  be  different,  since  it  by  nature  returns  a  whole 
family  of  nondominated  trajectories.  Injecting  user  preference  into  multi-objective 
optimization  is  an  active  area  of  research  and  techniques  similar  to  the  ones  described 
here  may  be  helpful,  but  some  work  would  have  to  be  done  to  adapt  them  to  the  field. 

In our experimental domains, we used the collocation-based BVP4C solver and
Henshaw’s extension BVP4C2 [51] in MATLAB for the split boundary value problem.
We found that we needed to make several assumptions to get the code to work, some of
which may affect the probability that the solution returned is the true optimum for the
given parameterized cost functional and not just a local minimum.

First, although BVP4C and BVP4C2 can theoretically solve for a free end time,
making a free-end-time problem converge requires a very good initial guess as to both
the shape and the duration of the trajectory. Without such an educated guess, the solver
returns a nonconvergent (invalid) solution. We instead provided the solver with an array
of possible final times and assumed smoothness between them. The solver iterated over
each possible final time in the array, and the costs of the resulting trajectories were
compared. If the lowest-cost trajectory was somewhere in the middle of the array, the
times on either side of it were taken as new upper and lower time limits for a new array
with smaller steps between final times. If the lowest-cost trajectory was at either end of
the array, the current time step was preserved and the lowest-cost trajectory was used as
either the new high or low end of the array. The lowest possible completion time was set
to 1 sec; if forming the new array would require a final time of less than 1 sec, the
lowest possible time was set to 1 sec and the time step recomputed to fit evenly between
1 sec and the high end of the time array.

Second, although the collocation routines are fairly robust, they are still
sensitive to certain numeric artifacts. We discovered, for instance, poor convergence for
certain final times. The algorithm would converge well for some t_f + dt and for
t_f - dt, but for t_f itself no good answer would be found. Other variables, including
continuation schedules (where obstacle potential functions or vehicle step-response
outputs are slowly brought from some smooth approximation to their sharper, final shape)
and obstacle placement (especially obstacles symmetric with respect to the initial path
guess), could also cause problems. Unfortunately, our automated data generation scheme
did not allow for the easy detection of these cases, nor did we have the time to hand-tune
every parameter in every optimization run to get the best results. (Dozens to hundreds of
optimal trajectories are generated for each set of L° that we attempt to satisfy.) In many
of these cases, the nonconvergence was not pathological; desired error bounds of 0.1 m
might be exceeded by errors of 0.2 m, for instance. Further, these cases were rarely the
lowest-cost trajectories, and we assumed that we could validly select the lowest-cost
convergent trajectory in the presence of nonconvergent trajectories. Those cases where
nonconvergent trajectories were selected as optimal are discussed in the relevant
Results sections in Chapters 4 and 5.
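
The final-time search described under the first assumption above can be sketched as follows. This is a Python illustration, not the MATLAB code used in this work. The function `cost` stands in for a full fixed-final-time BVP solve, and the treatment of an end-of-array minimum (the best time becomes the far end of a shifted array) is one interpretation of the text.

```python
T_MIN = 1.0  # lowest allowed completion time in seconds (from the text)


def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + k * step for k in range(n)]


def refine_final_time(cost, times, tol=1e-3, max_rounds=50):
    """Shrink or shift an array of candidate final times around the best one."""
    times = list(times)
    n = len(times)
    best_t = times[0]
    for _ in range(max_rounds):
        costs = [cost(t) for t in times]
        best = min(range(n), key=lambda k: costs[k])
        best_t = times[best]
        step = times[1] - times[0]
        if step < tol:
            break
        if 0 < best < n - 1:
            # Interior minimum: bracket it with its neighbours, finer steps.
            lo, hi = times[best - 1], times[best + 1]
        elif best == 0:
            # Minimum at the low end: keep the step and search earlier times.
            hi = times[0]
            lo = hi - step * (n - 1)
        else:
            # Minimum at the high end: keep the step and search later times.
            lo = times[-1]
            hi = lo + step * (n - 1)
        lo = max(lo, T_MIN)  # enforce the floor; linspace refits the step
        if hi - lo < tol:
            break
        times = linspace(lo, hi, n)
    return best_t


def stand_in_cost(t):
    # Hypothetical smooth cost with a single minimum at t = 1800 ** 0.25.
    return 600.0 / t ** 3 + t


best_final_time = refine_final_time(stand_in_cost, linspace(1.0, 21.0, 5))
```

The sketch relies on the same smoothness assumption as the text: if the cost has several local minima in final time, the search can settle on the wrong one.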

3.3  Feature  Extraction 

Feature extraction (FEXT) is a computational routine that takes as input the
generated trajectory and extracts from it certain gestalt properties useful in evaluating
the trajectory. Total battery power expended, total time, maximum speed, average speed,
and maximum acceleration during the traverse are typical features. A complete list of
features, together with the qualitative adverbs they define, is found in Appendix A. Some
are maximum or minimum values which are straightforward to express in L°; others are
averages or percentile values that give an overall impression of the trajectory. The
“percent plateau” values, for example, are the output of a routine that checks the
velocity and acceleration profiles for significant periods of time (at least 10% of the
total duration) during which the relevant value fluctuates by no more than 1% of its total
range. This was intended to give a numerical approximation to the human technique of
looking at a trajectory profile and estimating how “flat” it is.
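
A “percent plateau” routine of the kind described above might look like the following Python sketch. It assumes uniformly sampled profiles and reports the percentage of samples that fall inside qualifying flat stretches; the exact definition of the reported value in FEXT is not given in the text, so that choice is an assumption.

```python
import math


def percent_plateau(values, min_fraction=0.10, band_fraction=0.01):
    """Percentage of a uniformly sampled profile spent in flat stretches.

    A stretch qualifies if it lasts at least min_fraction of the samples
    and its values stay within band_fraction of the profile's total range.
    """
    n = len(values)
    band = band_fraction * (max(values) - min(values))
    min_len = max(2, math.ceil(min_fraction * n))
    covered = 0
    i = 0
    while i < n:
        lo = hi = values[i]
        j = i
        # Extend the window while its spread stays within the band.
        while j + 1 < n:
            new_lo = min(lo, values[j + 1])
            new_hi = max(hi, values[j + 1])
            if new_hi - new_lo > band:
                break
            lo, hi = new_lo, new_hi
            j += 1
        length = j - i + 1
        if length >= min_len:
            covered += length
            i = j + 1
        else:
            i += 1
    return 100.0 * covered / n


# Trapezoidal speed profile: ramp up, cruise at constant speed, ramp down.
profile = (
    [k / 20 for k in range(20)]
    + [1.0] * 60
    + [1.0 - (k + 1) / 20 for k in range(20)]
)
plateau = percent_plateau(profile)
```

For the trapezoidal profile above, only the cruise segment is flat enough and long enough to count.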

The  features  in  Appendix  A  are  very  loosely  domain-dependent.  More 
accurately,  they  can  be  defined  independently  of  a  domain,  but  simply  may  not  apply  to  a 
particular  domain.  An  average  rotational  rate  is  meaningless  in  a  2DOF  simulation  that 
has  only  linear  motion  in  the  x-y  plane.  Average  vehicle  separation  is  inapplicable  to  a 
single  vehicle  domain.  Likewise,  certain  adverbs  or  verbs  may  not  be  relevant  to  all 
domains,  even  though  they  exist  outside  of  them.  We  might  prefer  an  Army  field  vehicle 
to  move  “stealthily,”  but  there  is  little  call  to  require  a  space  robot  to  behave  in  such  a 
fashion. 

3.4  Weight  Adjustment 

The  development  of  good  weight  adjustment  (WADJ)  heuristics  was  a  key  part  of 
this  work.  Our  goal  was  to  automate  the  process  by  which  cost  functional  weights  are 
tuned.  This  is  typically  done  by  hand,  by  a  domain  expert,  until  the  desired  results  are 
achieved.  We  have  attempted  to  encode  these  desired  results  into  the  limits  L°,  as 
functions  of  the  features  defined  above.  What  remains  is  to  extract  domain  expert 
knowledge and techniques and automate the adjustment process.

Figure 8: Developing WADJ rules


Early  in  this  research  we  discovered  that  many  of  the  features  in  our  set  could  be 
expressed  as  functions  of  the  weights  used  in  the  cost  functional.  The  cost  functional  for 
the  2DOF  linear  domain  contained  three  fairly  typical  terms:  one  penalized  electrical 
energy  use,  another  penalized  time,  and  the  last  was  the  obstacle  penalty  function  shown 
in  Figure  2.  The  cost  functional  for  the  6DOF  nonlinear  domain  contained  four  terms, 
penalizing  fuel  used  for  thrust,  electrical  energy  used  for  torque  control,  time,  and  an 
obstacle  penalty  function.  Despite  the  different  dimensionalities,  cost  functionals,  and 
system dynamics, we were able to treat the generation of our WADJ heuristics in a similar
fashion in both cases.

Each test matrix covered a combinatorial set of cost functional term weights, Q.
Since a cost functional can always be normalized, we knew that we would be looking at
relative weights rather than their absolute magnitudes. Early experimentation led us to
conclude that a range of two orders of magnitude, from 0.1 to 10, would be sufficient to
see a broad range of dynamic behavior in our systems. (If this were ever not the case, it
would be a simple matter to extend the scope of the weight vector to see an even broader
range of behavior.) We allowed each individual weight to vary through {1, 2, 4, 8} which,
in combination, gave us relative weights from 2^-3 to 2^3, which roughly covered our two
orders of magnitude.
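
The combinatorial test matrix and the resulting range of relative weights can be checked with a short Python sketch. The function names are illustrative, and the use of three weights in the example follows the three-term 2DOF cost functional.

```python
from fractions import Fraction
from itertools import product

LEVELS = (1, 2, 4, 8)  # values each individual weight may take


def build_weight_matrix(n_weights):
    """All combinations of the allowed levels for n_weights weights."""
    return list(product(LEVELS, repeat=n_weights))


def ratio_range(matrix, i, j):
    """Smallest and largest ratio of weight i to weight j over the matrix."""
    ratios = [Fraction(w[i], w[j]) for w in matrix]
    return min(ratios), max(ratios)


matrix = build_weight_matrix(3)           # three-term cost functional
lowest, highest = ratio_range(matrix, 0, 1)
```

With four levels per weight, three weights give 64 combinations, and the ratio of any two weights spans 1/8 to 8.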

We  varied  the  magnitude  of  the  commanded  motions  and  also  the  number  of 
obstacles  in  the  field.  This  was  to  ensure  that  the  answers  we  were  getting  were  not  too 
specific  to  a  single  domain  subcase.  At  least  one  obstacle  was  needed  to  test  features  like 
minimum  separation  to  obstacles  ( min-sep );  adding  more  obstacles  would  show  how 
performance  changed  as  the  field  became  more  cluttered.  The  empty  field  and  single 
obstacle  test  cases  could  be  performed  very  quickly,  because  the  absence  of  many 
obstacles  greatly  simplifies  solving  the  problem. 

Once  we  had  collected  the  trajectories,  we  used  FEXT  to  compute  the  overall 
trajectory  features  in  which  we  were  interested.  For  features  relating  to  time  (e.g., 
velocity,  acceleration,  power),  we  found  strong  power  relationships  between  the  feature 
values  and  the  ratio  of  the  energy  or  fuel  term  weight  and  the  time  term  weight  (W1/W2). 
That  is: 


$\text{time\_feature} = c_1 \left( W_1 / W_2 \right)^{-a}$  (3.2)

The exponent -a in these equations stayed fairly constant across field sizes, although the
constant coefficient c1 varied.

Similarly, for path-based features like minimum separation from obstacles
(min-sep), there was a linear relationship between the feature and the influence limit
(LIM) in the obstacle penalty function:

$\text{path\_feature} = c_2 \, LIM$  (3.3)

Figure  2  shows  what  LIM  is.  There  is  an  elbow  in  the  obstacle  penalty  function  that 
occurs  at  the  edge  of  the  obstacle;  in  Figure  2,  this  occurs  at  a  value  of  1  m  on  the  x-axis. 
Some  distance  later,  at  2  m,  the  obstacle  penalty  function  goes  to  zero.  The  distance  past 
the  obstacle  edge  over  which  the  penalty  function  goes  to  zero  is  LIM.  Figure  9  shows  an 
obstacle  (circle  with  solid  red  line).  The  obstacle  penalty  term  was  also  weighted,  but  the 
effect  of  the  weight  was  nearly  negligible  in  comparison  to  varying  LIM,  as  shown  in 
Figure 9.

Figure 9: Effect of changing obstacle penalty weight (solid lines) and obstacle penalty function
influence limit (dashed lines) on separation from obstacle

The dotted red line in Figure 9 corresponds to one LIM value. The black line
closest to the obstacle is a path that goes within this LIM. That isn’t a problem by
itself; we assume the trajectory generator found that the cost incurred by getting closer
to the obstacle was less than some other cost. However, we show a hypothetical constraint
on path nearness to obstacles, min-sep, in green. (This min-sep was not used for these
data runs; this is simply an illustration.) In this case, the path is less than min-sep
away from the obstacle. This is a constraint violation, and the cost functional would be
adjusted to correct it. As Figure 9 shows, the most efficient way to adjust path nearness
to obstacles is to change LIM. Increasing LIM increases the distance between the vehicle
and the obstacle, meeting the required min-sep easily.

Rather than attempt to calculate tables of the constant coefficients c1 and c2 of
these equations for all possible field sizes, we compute them online, using the current
weight and feature values to back out the coefficient value. The coefficient, together
with the desired feature value (e.g., the limit, if it was violated), is then used to
recompute the weights.
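
The online back-out of the coefficients can be sketched in a few lines of Python. The function names are illustrative; the default exponent magnitude of 0.4 is the value reported later for avg-speed in the 2DOF domain and is used here only as an example.

```python
def retune_weight_ratio(ratio, feature, target, a=0.4):
    """New W1/W2 ratio from the power law feature = c1 * ratio ** (-a)."""
    c1 = feature / ratio ** (-a)          # back out c1 from current data
    return (target / c1) ** (-1.0 / a)    # invert the power law for the target


def retune_lim(lim, feature, target):
    """New LIM from the linear relation feature = c2 * LIM."""
    c2 = feature / lim                    # back out c2 from current data
    return target / c2


# Current trajectory: ratio 1.0 gave avg-speed 1.12; we want avg-speed 2.0.
new_ratio = retune_weight_ratio(1.0, 1.12, 2.0)

# Current trajectory: LIM 1.0 gave min-sep 0.8; we want min-sep 1.2.
new_lim = retune_lim(1.0, 0.8, 1.2)
```

Because the exponent is negative, asking for a higher speed lowers the energy-to-time weight ratio, while asking for more separation raises LIM proportionally.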

As  more  obstacles  are  added  to  the  field,  the  rules’  accuracy  is  affected. 
(Domain-specific  examples  are  given  in  the  relevant  chapters.)  They  remain,  however, 
useful  rules  of  thumb  for  guiding  weight  adjustment,  as  our  results  will  show. 

The  effects  of  changing  the  fuel/time  weight  ratio  or  the  energy/time  weight  ratio 
and  LIM  were  largely  independent.  This  allowed  us  to  decouple  them,  an  important  and 
useful  assumption.  They  are  not,  however,  entirely  independent.  As  LIM  decreases,  for 
example,  more  direct  paths  which  save  both  fuel  and  time  can  be  found.  The  effect  is  not 
dramatic,  but  can  mean  the  difference  between  a  successful  and  unsuccessful  solution.  If 
the  standard  WADJ  rules  have  failed  to  find  a  solution  that  mediates  between  competing 
time  and  fuel  goals,  a  secondary  WADJ  rule  will  change  LIM  in  an  attempt  to  take 
advantage  of  this  secondary  effect  and  find  a  successful  solution. 

We were concerned that the 6DOF spacecraft domain with nonlinear dynamics
would not be amenable to this WADJ rule-generation process. Results for the 6DOF
domain were in fact very similar to those for the 2DOF domain. A notable difference
was the torque weight term, which is unsurprising given the coupled nature of the
rotational and translational mechanics. Our torque heuristic is discussed in detail in
Chapter 5.

3.5  Implementation 

Initially,  the  TPLAN  and  FEXT  modules  were  implemented  in  MATLAB.  While 
MATLAB  runs  more  slowly  than  equivalent  code  implemented  in  C  or  C++,  this  gave  us 
ready  access  to  the  BVP4C  function  and  the  BVP4C2  code  and  domains  developed  in 
[51] and minimized coding overhead. One could expect significant execution speed
increases with a translation to C or C++, making this work quite practical for complex
dynamics and constraint sets.

INIT  and  EVAL  were  first  implemented  in  the  cognitive  architecture  ACT-R  [52], 
which  is  itself  implemented  in  Lisp  [53].  (WADJ  was  at  this  point  a  computational 
module  written  in  Lisp,  since  it  was  used  by  EVAL.)  Since  we  were  attempting  to  model 
a  decision  making  process  currently  handled  by  humans,  our  thinking  was  that  a 
“cognitive  model”  would  be  best  suited  for  the  task.  However,  the  code  which  resulted 
was  far  from  a  cognitive  model,  despite  its  implementation  in  a  cognitive  architecture 
[54].  Furthermore,  as  we  moved  into  the  nonlinear  6DOF  domain,  it  became  doubtful 
that  human  decision  making  would  even  be  something  that  we  would  want  to  emulate. 
Humans  are  very  adept  at  controlling  linear  systems,  but  our  physical  intuition  breaks 
down  for  nonlinear  ones.  As  the  reasons  for  using  a  cognitive  architecture  became  more 
uncertain,  INIT,  EVAL,  and  WADJ  were  all  migrated  to  MATLAB  since  this  offered  the 
certain benefit of easier integration. Additionally, the implementation of iterative loops
required for constraint checking was in many ways simpler and more obvious in
MATLAB. Mathematical computations, while certainly possible in Lisp, were also
easier to represent in MATLAB.

Implementing  these  routines  in  MATLAB  gave  further  insight  into  the  strengths 
of  the  cognitive  architecture.  ACT-R  excels  in  pattern-matching,  which  it  uses  to  find  the 
right  “procedural  knowledge”  (an  if-then  rule)  given  the  “declarative  knowledge”  (a 
piece  of  data)  and  the  goals  that  it  currently  has.  The  main  work  of  building  the  ACT-R 
model  was  finding  the  right  representation  of  the  declarative  knowledge  to  enable  general 
and  useful  procedures  that  were  not  all  simply  one  special  case  after  another. 

When  this  was  done  correctly,  it  made  extending  the  modules  very  easy.  To  add  a 
new  adverb,  for  instance,  the  user  had  only  to  write  declarative  chunks  for  the  features 
which  composed  the  adverb,  and  the  levels  at  which  those  features  were  present.  If  any 
new  features  were  defined,  their  relationship  to  the  weights  would  have  to  be  defined  as 
well,  but  that  would  also  only  be  a  small  set  of  “chunks”  of  declarative  knowledge.  The 
new  adverb  and  new  features  could  then  be  treated  just  like  all  of  the  pre-existing  adverbs 
and  features  by  the  system. 

In MATLAB, adding a new adverb requires adding new cases to switch
commands in several subroutines. The actual amount of information that the user must
add is not much more than in the ACT-R case, but it is not all contained in one localized
area. The pattern-matching capabilities of ACT-R are simply more elegant than those in
MATLAB and, when dealing with symbolic information such as words, this gives ACT-R
a distinct advantage. Also, ACT-R and other cognitive architectures have learning
functions built into them, which would make it easier for the system to learn individual
users’ preferred fuzzy definitions. (That is, if the user always responds, “No, faster!” to
the returned trajectory for something “very fast,” the system can update the values for
“very fast” to reflect this.) So, while the cognitive modeling aspect of the cognitive
architecture has proven not to be very important to this work, the capabilities of the
architecture may yet be useful in extending and generalizing the work.

4 Two Degree of Freedom Point Rover


A  simplified  2-D  domain  model  was  developed  as  an  intuitive  baseline  case  for 
our  architecture  and  as  a  method  of  developing  initial  modules  to  populate  the  Figure  4 
architecture. 

4.1  System  Dynamics 

We began our investigations with a 2DOF point-robot model, imagining a
rover-like robot traveling in a plane, using electric motors for propulsion. We used this
highly simplified domain to gain intuition for the process of adjusting the cost functional
weights and computing, then evaluating, the resulting trajectories. The model has simple
linear dynamics:

x'(t) 
x"(t) 

where m is object mass and c_s is the coefficient of sliding friction. We assume an
idealized system with no motor saturation and with perfect trajectory tracking.

4.2  Terms  of  the  Cost  Functional 

In  robotic  applications,  two  concerns  are  usually  paramount:  conserving  fuel  or 
battery  power  and  not  running  into  obstacles.  Additionally,  there  may  be  time  constraints 
on a mission. Equation (4.2) gives the cost functional J and weight vector
Q1 = [W1, W2, W3, LIM]. Each term is described more fully below.

4.2.1 Energy Use

Since we are considering a hypothetical battery-powered vehicle, we followed
[46] in adding a minimum-energy term. We have simplified that representation somewhat;
the author of [46] includes a potentially different weight for each u_j(t) in the control
vector. We do, however, differentiate control vector elements in the 6DOF example shown
below, where translational actuators require fuel and rotational actuators require
electrical power.

4.2.2  Time 

Since J is an integral, the cost functional only needs a constant term, W2, to
minimize time. Over the integral, the resulting W2 * t_f will be minimized.

4.2.3  Clearance  to  Obstacles 

To keep the vehicle away from obstacles, we add the term
$W_3 \sum_{i \in \{O\}} o_i(r_i)$, presuming simple circular obstacle geometries. (A sum
is used, rather than a maximum, to preserve the smoothness criterion required for
convergence in the collocation solver.) We assume that our agent has an a priori map of
the region it will traverse, possibly obtained from an orbiter or aerial overflight.
o_i(r_i) is a function which increasingly penalizes the agent as it approaches obstacle i.
W3 is the relative weight in overall cost from Equation 4.2, and r_i is the distance from
the vehicle to obstacle i’s center:

$r_i = \sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2}$  (4.3)

o_i(r_i) has maximum value MAX at the center of the obstacle, attains the fixed value K
at the obstacle boundary, a distance R_i from the obstacle center, and decreases to zero
at a distance LIM away from the obstacle's edge (Figure 2). These constraints are
described by Equation 4.4 and also include smoothness conditions. A third-order
polynomial solution (Equation 4.5) that meets these constraints was selected as o_i(r_i).
This solution is positive within the region of influence (r_i < LIM) and effectively
repels the path given sufficient MAX, K, and LIM values.

$o_i(r_i) = \begin{cases} \;\vdots \\ 0, & r_i > LIM \end{cases}$  (4.5)


MAX and K were chosen in an ad hoc fashion, after some experimentation. They
should be sufficiently large compared to the other costs to appear nearly infinite; Eq. 4.5
is equal to K when the vehicle is touching the obstacle, and we would like to model any
further passage into the obstacle as being of infinite cost. (In practice, if the path does
go within the obstacle, this will violate an implicit hard min-sep limit and cause a
recalculation of the trajectory.) The coefficients c are found by solving the two
third-order equations and their first and second derivatives to satisfy all of the
smoothness constraints.

In our analysis of the effects of changing the weighting parameters on the
trajectory, we included changes not only in W3 but also in LIM. They had distinctly
different effects, as shown in Figure 9 above.

In addition to these terms, the system dynamic equations are adjoined to J as
shown in Eq. 2.9. This ensures that only dynamically feasible trajectories will be
considered.

4.3  Development  of  Weight  Adjustment  Heuristics  and  Fuzzy  Rules 

4.3.1  Weight  Adjustment  Heuristics 

The  procedure  outlined  in  Chapter  3  was  followed,  with  WADJ  rules  and  fuzzy 
correlation  rules  established  for  all  of  the  features  addressed  in  the  current 
implementation  of  the  architecture.  The  general  procedure  was  very  similar  for  each,  so 
one  detailed  example  is  presented  here,  while  the  rest  of  the  WADJ  plots  can  be  found  in 
Appendix  C. 

Figure 10 shows the data collected for average speed as a function of the ratio of
the energy weight, W1, to the time weight, W2. In the empty 10 m x 10 m field, this linear
system shows very predictable behavior, described well by the Figure 10 equation plotted
in black. As obstacles are added to the field, it becomes clear that, while the equation
still has value as a heuristic, it is not as accurate in predicting the average speed for a
given set of weights as it is in the empty field case.

[Plot: avg-speed versus W1/W2; series for 0, 1, and 3 obstacles, with a power-law fit to
the 0-obstacle data: avg-speed = 1.12 (W1/W2)^-0.38, R² = 0.9914]

Figure 10: Average speed as a function of W1/W2 for zero, one, and three obstacles in a
10 m x 10 m field


Figure 11 shows the change in the equation linking average speed to the W1/W2
ratio as field size changes. 200 m x 200 m, 20 m x 20 m, and 2 m x 2 m fields were used,
and the 10 m x 10 m data from the Figure 10 zero-obstacle case is included as well. There
is variation in the exponent, particularly between the mid-range and large field sizes and
the small field. In practice, we would not expect to operate frequently in such a small
field (unless the robot was exceptionally small, in which case we would probably not be
operating in the larger regimes).

Figure 11: Average speed as a function of W1/W2 for four different empty field sizes


Since the equations serve only as a heuristic and become less accurate predictors
in cluttered environments (Figure 12), we did not want to impart too much precision to
the exponent. An average of all of the exponents yields a value of -0.41; the average of
the three largest fields yields a value of -0.44. Both of these can be rounded to a single
decimal place as -0.4, which is what was done. This resulted in the WADJ equation
relating average speed to W1/W2:

$\text{avg-speed} = c_1 \left( W_1 / W_2 \right)^{-0.4}$  (4.5)

The constant c1 is calculated from actual values of avg-speed and W1/W2 at runtime.

Figure 12: Heuristic equation less predictive as obstacles are added to the environment
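
A power-law relation of this form can be fitted by ordinary least squares on the logarithms of the data. The Python sketch below shows one such fit and the rounding of the exponent to one decimal place. It uses synthetic points generated from the zero-obstacle fit reported in Figure 10, not the experimental data, and the fitting method is an assumption, since the text does not state how the curves were fitted.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = c * x ** p in log-log space; returns (c, p)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    p = sxy / sxx                         # slope is the exponent
    c = math.exp(mean_y - p * mean_x)     # intercept gives the coefficient
    return c, p


# Synthetic points on avg-speed = 1.12 * (W1/W2) ** -0.38 (Figure 10 fit).
ratios = [0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
speeds = [1.12 * r ** -0.38 for r in ratios]

c1, exponent = fit_power_law(ratios, speeds)
heuristic_exponent = round(exponent, 1)   # deliberately coarse, as in the text
```

Only the rounded exponent is carried into the heuristic; the coefficient is discarded because it is recomputed at runtime.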


4.3.2  Heuristic  Equation  Verification 

To  test  our  procedure,  we  compared  it  to  analytical  results.  While  there  is  no 
analytical  solution  for  a  cost  functional  which  includes  the  obstacle  penalty  function,  one 
can  be  derived  for  the  zero  obstacle  case,  where  only  energy  and  time  are  traded  off.  For 
greatest  simplicity,  we  examined  a  system  without  friction  and  only  along  one  axis  (valid 
because  the  components  are  independent  of  each  other).  System  dynamics  are  given  in 
Equation  4.6. 


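$$\begin{bmatrix} \dot{x}(t) \\ \ddot{x}(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ \dot{x}(t) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} \frac{u(t)}{m} \qquad (4.6)$$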

Our simplified cost functional is given in Equation 4.7. As in our implementation, we
normalized by the time weight.

$$J = \int_{0}^{t_f} \left( W u^2 + 1 \right) dt \qquad (4.7)$$

To this we appended the system dynamics. The interior of that integral gives us
our Hamiltonian, as described in Equation 2.10. Computing the Euler-Lagrange
Equations (Eq. 2.11a-c) yields four linear ordinary differential equations with four
unknown coefficients. The final time is also unknown. We use the known initial and final
states to solve for four of the unknowns and the transversality condition to get the fifth.

Solving the system for tf in terms of the final position, xf, and weight W yields:

tf  =  (4.8) 


We ran our solver for a range of W with xf = 10 m and compared the results to the solution
given in Equation 4.8. The computed final times are precise to ±0.005 seconds. The results
are summarized in Table 1; the numbers agree to within that precision.


W        tf computed (s)    tf from Eq. 4.8 (s)
0.125    3.875              3.873
0.250    4.609              4.606
0.500    5.473              5.477
1.000    6.518              6.514
2.000    7.748              7.746
4.000    9.207              9.211
8.000    10.953             10.954

Table 1: Verifying WADJ heuristics with analytical predictions
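The agreement claimed for Table 1 can be checked directly from the tabulated values. The short sketch below does so. It also checks one empirical pattern in the analytical column: each doubling of W multiplies tf by roughly 2^(1/4). That pattern is an observation about the table, not a statement of Equation 4.8, and the names used are illustrative.

```python
# Rows of Table 1: (W, tf from the solver, tf from Equation 4.8), in seconds.
TABLE_1 = [
    (0.125, 3.875, 3.873),
    (0.250, 4.609, 4.606),
    (0.500, 5.473, 5.477),
    (1.000, 6.518, 6.514),
    (2.000, 7.748, 7.746),
    (4.000, 9.207, 9.211),
    (8.000, 10.953, 10.954),
]
PRECISION = 0.005  # stated solver precision, in seconds


def max_disagreement(rows):
    """Largest absolute difference between the computed and analytical tf."""
    return max(abs(computed - analytical) for _, computed, analytical in rows)


def doubling_ratios(rows):
    """Ratio of successive analytical tf values (W doubles from row to row)."""
    analytical = [row[2] for row in rows]
    return [b / a for a, b in zip(analytical, analytical[1:])]


if __name__ == "__main__":
    print(f"max |computed - analytical| = {max_disagreement(TABLE_1):.3f} s")
    print([round(r, 4) for r in doubling_ratios(TABLE_1)])
```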




We  note  that  our  BVP4C-based  solver  is  not  using  BVP4C’s  ability  to  solve  for  a  free 
final  time.  That  proved  too  sensitive  to  initial  estimates  to  be  of  use.  Instead,  we  search 
with  increasing  granularity  over  a  range  of  known  final  times,  narrowing  our  search  in 
the  low-cost  region.  This  method  is  vulnerable  to  local  minima  in  cases  where  the  cost 
functional  results  are  not  smooth,  as  they  may  be  in  the  presence  of  many  obstacles. 
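The search over known final times described above can be sketched as a coarse-to-fine grid search. This is an illustration under stated assumptions, not the BVP4C-based solver: `cost` stands in for solving the fixed-final-time problem and returning its cost, the grid size and number of refinement rounds are arbitrary, and the toy cost in the usage example is made up.

```python
def refine_final_time(cost, t_lo, t_hi, samples=11, rounds=4):
    """Grid-search `cost` over [t_lo, t_hi], then repeatedly zoom in on the best sample."""
    best_t = t_lo
    for _ in range(rounds):
        step = (t_hi - t_lo) / (samples - 1)
        grid = [t_lo + i * step for i in range(samples)]
        best_t = min(grid, key=cost)
        # Narrow the window to one grid step on either side of the best sample.
        t_lo = max(grid[0], best_t - step)
        t_hi = min(grid[-1], best_t + step)
    return best_t


if __name__ == "__main__":
    # Toy, smooth, single-minimum cost with its minimum at tf = 4.2 s (hypothetical).
    toy_cost = lambda tf: (tf - 4.2) ** 2 + 1.0
    print(refine_final_time(toy_cost, 1.0, 10.0))
```

Because each round keeps only the neighbourhood of the best coarse sample, a cost curve with several valleys can lead the search into the wrong one. That is the local-minimum vulnerability noted in the text.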

4.3.3  Fuzzy  Rule  Database  Z 

The  fuzzy  logic  portion  of  the  initialization  routine  required  three  different  sets  of 
rules  or  correlations  to  be  made:  weight  values  to  fuzzy  levels,  feature  values  to  fuzzy 
levels,  and  fuzzy  feature  levels  to  fuzzy  weight  levels. 

The data in Figure 10 and Figure 11 was generated using W1/W2 ratios that were
powers of two: 2^-12, 2^-11, ... 2^12. Before the fuzzy logic portion of this research was even
fully expressed, these numbers were chosen because they appeared to span the range
from “very low” to “very high” relative weights. It seemed reasonable, then, to formally
assign these values to the “fuzzy triangles” that relate weight values to their fuzzy levels.
The “medium” W1/W2 triangle (as in Figure 3), for example, is 100% medium at 2^0, and
0% medium at 2^-1 and 2^1.
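A triangular membership function of this kind can be sketched as follows. The corner points for “medium” come from the text. Interpolating linearly in the exponent (log2 of the ratio) rather than in the ratio itself is an assumption, and the function names are illustrative.

```python
import math


def triangle(x, left, peak, right):
    """Triangular membership: 0 at `left` and `right`, 1 at `peak`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def medium_membership(ratio):
    """Degree to which a W1/W2 ratio is 'medium', with corners at 2^-1, 2^0, 2^1."""
    return triangle(math.log2(ratio), -1.0, 0.0, 1.0)


if __name__ == "__main__":
    for ratio in (0.5, 0.75, 1.0, 1.5, 2.0):
        print(ratio, round(medium_membership(ratio), 3))
```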

The assignment of feature values to fuzzy levels is a matter of judgment. In
industry, teams of scientists expend significant effort to ensure that a washing machine’s
definition of “very clean” is as close to the definition used by the majority of the target
consumers as they can make it [2]. Without a real robot or a real consumer (paying
customer) to provide feedback for this problem, we have made our own judgments based
on the Figure 10 data.




As a starting point, we correlated the feature values at the weight values to fuzzy
levels. That is, if W1/W2 is “medium” between 2^-1 and 2^1, we looked at the feature values
in Figure 10 that resulted from those weight values. If it seemed reasonable to call those
feature values “medium,” we did so. If, because of rapid changes in the feature value
curve, that interval clearly did not correspond to a single fuzzy level, we looked for the
nearest values that could reasonably define it.

The  fuzzy  relationships  fell  into  two  categories:  directly  and  inversely  related.  In 
directly  related  cases,  “very  low”  weight  values  resulted  in  “very  low”  feature  values; 
“low”  weight  values  resulted  in  “low”  feature  values,  and  so  on.  In  inversely  related 
cases,  “very  low”  weight  values  resulted  in  “very  high”  feature  values. 

Although  this  implementation  of  fuzzy  logic  is  not  highly  sophisticated,  it  was 
sufficient  to  generate  improved  estimates  of  initial  weight  sets  for  our  optimization 
processes.  The  full  set  of  fuzzy  rules  is  found  in  Appendix  C. 

4.4  Results 

To evaluate the performance of our system for the 2D robot domain, five different
logical sets of constraints L° of varied complexity were enforced on four different
obstacle fields {O} for a total of twenty trials. There were, overall, 28 hard limits (H°)
and 72 soft limits (S°). The simplest constraint set enforced one hard and two soft
constraints; the most extensive had two hard constraints and six soft constraints.
Appendix B fully details the constraint and obstacle sets used for the test cases.

Each of the twenty test cases was run from a default weight vector
Qdefault = [1, 1, 1, 1] and from a Q1 provided by INIT. Only the collocation TPLAN
BVP4C was used for trajectory generation. The results after one iteration (labeled
Default1 and INIT1) and after program completion (Defaultn and INITn) were examined
for both starting weight vectors.

Before  looking  at  the  overall  results,  we  present  a  detailed  walk-through  of  two 
particular  solutions  to  give  the  reader  a  clearer  picture  of  how  the  EVAL  and  WADJ 
processes  work,  and  how  they  impact  the  generated  trajectories.  The  first  example  shows 
the  routine  working  smoothly.  The  second  example  shows  a  partial  success  even  in  the 
face  of  some  unexpected  behavior. 

4.4.1  Detailed  Examples 

Constraint Set 3 was self-sabotaging. It asked for trajectories with S° = {a little
quickly, exceedingly inquisitively}. These expanded into high avg-speed and max-speed
and medium-low avg-speed and low min-sep, respectively. It would not be possible to
satisfy both high avg-speed and medium-low avg-speed. The idea, of course, was to
create a trajectory that was mostly inquisitive but on the fast end of that range. The returned
trajectories for Obstacle Sets 2 and 3 achieved exactly that; Obstacle Sets 1 and 4
converged to a solution that favored “quickly.” (User preference information strength, as
described by the adverbial modifiers “a little” and “exceedingly,” is lost after
initialization in the current version of the architecture.) We will look at the result for
Obstacle Set 1.

INIT returned initial weights Q1 = [0.954, 1.00, 1.00, 0.50]. (Recall that
W1 is the energy weight, W2 is the time weight, W3 is the weight on the obstacle penalty
function term, and W4 is LIM, the obstacle penalty function influence limit.) We
normalized the weights by W2, the time weight, in the 2DOF case. The result was not too
different from the default weight vector [1, 1, 1, 1], although LIM was half its default,
anticipating the desire for “medium-low min-sep”.

Figure 13 and Figure 14 show the results for all the trajectories; the paths in
particular were so similar that it is necessary to graph them all together to detect
differences. The min-sep requirements were met on all paths. Using the Q1 weights,
max-speed was 1.44 m/s and avg-speed was 1.22 m/s. For “inquisitively,”
0.67 m/s < avg-speed < 1.33 m/s, so this soft constraint was satisfied. It was of course
too slow for “quickly,” whose “high avg-speed” required 2.67 m/s < avg-speed < 5.33 m/s,
and whose “high max-speed” required 4 m/s < max-speed < 8 m/s. Q2 was generated using the
larger failure, on max-speed, giving Q2 = [0.07, 1.00, 1.00, 0.50].

Such a low W1/W2 ratio indicated that time should be excessively optimized, and
that was exactly what happened. max-speed was raised to 4.17 m/s, just at the low end of
satisfying “quickly.” avg-speed was raised to 3.21 m/s, also satisfying “quickly.” By
necessity, avg-speed now violated the “medium-low avg-speed” constraint within
“inquisitively,” exceeding it by 1.87 m/s. The speeds were too fast, so the W1/W2 ratio
had to be raised. Q3 was calculated as [0.67, 1.00, 1.00, 0.50].

This result was very similar to the Iteration 1 results. It was a little faster, so
max-speed and avg-speed failed their “quickly” requirements by less this time, undershooting
by 2.44 m/s and 1.34 m/s respectively. These errors fed into the weight adjustment, giving
Q4 = [0.06, 1.00, 1.00, 0.50], almost the same as Q2. Indeed, the Q4 results were close to
the Q2 results. avg-speed was 3.42 m/s, 2.09 m/s too fast for “inquisitively.” max-speed
was 4.52 m/s; both of these values were on the lower ends of the ranges for “quickly,” as
we would hope they would be as they moved toward some compromise with
“inquisitively.” But the compromise ended here; Q4 was too similar to Q2. When the
weights were readjusted, the results were too close to it (the threshold was set at 0.01).
The process terminated and returned the best identified trajectory.
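The loop check that ends the process can be sketched as a comparison of each proposed weight vector against those already tried. Only the 0.01 threshold comes from the text. The componentwise strict comparison and the function names are assumptions, and the usage example borrows the W1/W2 sequence from the Constraint Set 1 walk-through later in this section.

```python
THRESHOLD = 0.01  # equivalence threshold quoted in the text


def equivalent(w_a, w_b, threshold=THRESHOLD):
    """Treat two weight vectors as equivalent if every component differs by less than threshold."""
    return all(abs(a - b) < threshold for a, b in zip(w_a, w_b))


def closes_loop(candidate, history, threshold=THRESHOLD):
    """True if `candidate` is equivalent to any weight vector already tried."""
    return any(equivalent(candidate, past, threshold) for past in history)


if __name__ == "__main__":
    # W1 values 0.214, 0.273, 0.015, 0.050, 0.002 with W2 = W3 = W4 = 1.0.
    tried = [[w1, 1.0, 1.0, 1.0] for w1 in (0.214, 0.273, 0.015, 0.050, 0.002)]
    print(closes_loop([0.002, 1.0, 1.0, 1.0], tried[:4]))   # new region, keep iterating
    print(closes_loop([0.0002, 1.0, 1.0, 1.0], tried))      # equivalent to 0.002, stop
```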

Constraint Set 1 consisted of one hard constraint, H° = {max-speed < 4.2 m/s},
and one soft constraint, S° = {“somewhat quickly”}. “Quickly” was defined in the fuzzy
language database V as correlating to the state features max-speed and avg-speed, both at
level “high.” The fuzzy rules database Z defined, for this particular domain with this
particular simulated robot, “high max-speed” to be between 4 and 8 m/s; “high
avg-speed” was similarly between 2.67 and 5.33 m/s. So to meet both H° and S°, the trajectory
planner had to find a solution with a max-speed greater than 4 m/s (the lower limit for
“high max-speed”) but below 4.2 m/s (the hard constraint). Putting the max-speed in that
0.2 m/s window proved difficult; only one case out of eight managed it. We will look at
one where it did not fully succeed, to see how the trade-offs were being made, and why
the success was only partial.

INIT, given the constraints L° above, returned an initial set of weights
Q1 = [0.214, 1.00, 1.00, 1.00]. We should read this as saying that minimizing time is
roughly five times more important than minimizing energy. This generated the path and
trajectory seen in Figure 15 and Figure 16, below. Although it met both S°, the
max-speed was 4.30 m/s, failing H° by 0.10 m/s. WADJ used this error to compute a new
W1/W2 ratio. The new W1 was 0.273 relative to a W2 of 1.0. (W3 and W4 were unchanged
since we assume the time-dependent properties are independent of path properties.) By
raising W1/W2 slightly, we hoped to elicit a slightly slower, more energy-efficient
trajectory, one with a max-speed below 4.2 m/s.

Figure 13: 2DOF paths for CS 3, OS 1

Figure 14: 2DOF path trajectories for CS 3, OS 1

Figure 15: 2DOF path for Q1 = [0.214, 1.0, 1.0, 1.0] (panel title: Path for CS 1, OS 2, Iter 1; W1/W2 = 0.21, LIM = 1.00)

Figure 16: 2DOF trajectory for Q1 = [0.214, 1.0, 1.0, 1.0] (panel title: Trajectory for CS 1, OS 2, Iter 1)


Figure 17 and Figure 18 show the results from the trajectory planner using Q2. In
this case it had converged to a path quite different from that of the prior iteration. This
path was longer, and the agent took more than three times as long to traverse it. This kept the
energy consumption and the speeds down. The max-speed here was only 1.26 m/s,
meeting H° easily. Of course, this was too slow to satisfy S°, failing the max-speed
requirement by 2.74 m/s and the avg-speed requirement by 1.61 m/s. But we had a
partial success, since H° was met, so this trajectory was stored as the best trajectory so far.
Then we entered the S° satisfaction loop shown in Figure 5; time_limit was set to five
iterations. Attempts were then made to improve the solution to better meet soft
constraints S° while still meeting hard constraints. A new Q3 was computed; the W1/W2
ratio was 0.015, smaller than it was in Q1 because of the rather large amount by which the
soft max-speed constraint failed. EVAL does not yet have the sophistication to check for
more than loops in the weights, so it did not notice that this was likely to be a bad choice
of weights.

Figure 17: 2DOF path for Q2 = [0.273, 1.0, 1.0, 1.0] (panel title: Path for CS 1, OS 2, Iter 2; W1/W2 = 0.27, LIM = 1.00)

Figure 18: 2DOF trajectory for Q2 = [0.273, 1.0, 1.0, 1.0]


Figure 19 and Figure 20 show the results of using Q3 to generate the trajectory.
The path found had the same shape as in Iteration 1, but the time to traverse
it had decreased even more. Just as in Iteration 1, this was too fast for H°, so WADJ
backed off from the current value of W1/W2. Recall Equation 4.5, which relates the
desired feature value to W1/W2. There is a constant coefficient, c1, in that equation, which
can vary strongly according to field size. Rather than try to maintain a rigid table of c1
values, we compute it online during every WADJ from the current W1/W2 value and the
current feature value (max-speed, in this case). This means that, when we applied the
same rule that we did after Iteration 1, we did not get the same new W1/W2 ratio that we
obtained previously, because c1 had changed. max-speed for this iteration was 6.79 m/s,
some 2.50 m/s too fast. Using this to adjust the weight ratio to a value that better
supported slow speeds, we obtained Q4 = [0.050, 1.00, 1.00, 1.00] and ran the trajectory
planner again.

Figure 19: 2DOF path for Q3 = [0.015, 1.0, 1.0, 1.0] (panel title: Path for CS 1, OS 2, Iter 3; W1/W2 = 0.01, LIM = 1.00)

Figure 20: 2DOF trajectory for Q3 = [0.015, 1.0, 1.0, 1.0]


Despite a very low W1/W2 ratio (time is 20 times more important to minimize than
fuel) we obtained a trajectory much like that in Iteration 2 (where W1/W2 = 0.274). The
path in this case (Figure 21) was shorter and more direct than in Iteration 2, but the time
to traverse it was comparable (Figure 22). This was a curious result; we would expect
from the WADJ heuristics that a trajectory generated with such a low W1/W2 would have
higher speeds. However, the weighted sum approach that we use for our cost
functional can sometimes return the same result for different weights [41]. Another
possibility is that this is a local minimum, if the search over the time domain of trajectories
missed the low-time valley containing the expected solution. Our lowest allowable time for a
trajectory was 1 second. Since our previous fast trajectories had final times on the order
of 3 - 5 seconds, it is possible that the granularity of the final time matrix was too coarse
and thus missed this solution.

The max-speed in this case was only 1.11 m/s, meeting H° (as WADJ was
attempting). As before, this failed the S° by fair margins (2.89 m/s too slow to make the
low end of “high max-speed”). The W1/W2 weight ratio was again reduced, this time to the
very small value of 0.002. EVAL checks for loops when weights are “equivalent,” that is,
within some preset threshold of each other. In our tests, this threshold was set to 0.01, so
this new ratio was considered to cover the region between 0 and 0.01.

Figure 21: 2DOF path for Q4 = [0.050, 1.0, 1.0, 1.0] (panel title: Path for CS 1, OS 2, Iter 4)

Figure 22: 2DOF trajectory for Q4 = [0.050, 1.0, 1.0, 1.0] (panel title: Trajectory for CS 1, OS 2, Iter 4)


Iteration 5 returned unusual results. Again despite the very low W1/W2 value, we
were near the same final time as in Iteration 4. The path shape (Figure 23) was back to
something similar to that found for Iteration 1, but it traversed more widely around the
obstacles. At this point, the energy term had been so heavily discounted that the obstacle
penalty term was exerting a stronger influence on the solution. Figure 24 shows the
trajectory information, which was not at all as “quickly” as we might have liked. H° was
still met (max-speed = 1.70 m/s) for this case, but that was too low for S°. Perhaps we
were in the same local minimum? WADJ tried again for new weights and returned a
still-smaller W1/W2 value, 0.0002, but that is, to our level of precision, the same as 0.002.

The weights had made a loop, and the best trajectory was returned. Trajectories 1 and 3
failed H° and so are clearly not acceptable. Trajectory 2 failed max-speed by 2.74 m/s
and avg-speed by 1.60 m/s; Trajectory 4 failed them by 2.89 and 1.73 m/s, respectively;
Trajectory 5 was ultimately returned as the best trajectory even though it failed
max-speed by 2.31 m/s and avg-speed by 1.26 m/s.

What was the problem (beyond competing hard and soft constraints)? The initial
guess got us quite close to a good solution, and the adjustment to meet H° seemed like a
reasonably small change in the weights. But the convergence to the wildly different path
shape and its very low speeds put us into a WADJ cycle around extreme weight values.
The trajectory planner’s search over the free final time parameter was at a fixed starting
granularity, which may have been unsuitable for this problem and may have resulted in
repeated convergence to a local minimum. Our technique is, as all numerical optimizers are,
vulnerable to these problems.

Figure 23: 2DOF path for Q5 = [0.002, 1.0, 1.0, 1.0] (panel title: Path for CS 1, OS 2, Iter 5; W1/W2 = 0.00, LIM = 1.00)

Figure 24: 2DOF trajectory for Q5 = [0.002, 1.0, 1.0, 1.0]


4.4.2  Overall  2DOF  Results 


Figure 25 shows the total number of failures for each of the four cases Default1,
INIT1, Defaultn, and INITn. INIT1 shows a clear advantage over Default1, with fewer
failures to meet both H° and S°. We are not in this work formally working within an anytime
planning framework, but this significant improvement in solution quality for the first
iteration would be of benefit should we extend the work in that direction.

Figure 25: Failures for each 2DOF solution case out of 28 H° and 72 S° (cases: Default1, INIT1, Defaultn, INITn; legend: S° failures, H° failures)

Final results are not nearly as dramatically different as initial results. Our
different starting points in these cases did not, after repeated applications of WADJ, result
in significant differences in final solution. However, INITn converged to an acceptable
solution in, on average, 5.25 iterations. Three times, only one iteration was required, and
the maximum number of iterations was 13. Defaultn required on average 6.10 iterations;
it never found an acceptable trajectory on the first try, but its maximum number of
iterations was only 9. Figure 26 shows a histogram of the number of iterations each
solution required before returning. Overall, INIT has more returns with fewer iterations
than Default.

Figure 26: 2DOF iterations through architecture

Obstacle arrangement did not have a strong effect on the number of
S° failures, as shown in Figure 27 below. Except for INIT in Obstacle Set 4, all final
solutions had between 44% and 56% S° failure rates when grouped by obstacle field. We
are pleased to see that solution quality is not greatly affected by obstacles.

Figure 27: 2DOF S° failure rates by obstacle set (legend: Defaultn, INITn)


Failure rates in S° were impacted by the constraint set, as might be expected.
Figure 28 gives the failure rates. Constraint Sets 1 and 4 had the overall highest rates. In
Constraint Set 1, a hard limit required a max-speed of less than 4.2 m/s, while a fuzzy
user preference for “very quickly” was also expressed. “Very quickly” was expanded to
“high max-speed” and “high avg-speed,” which defuzzified into 4.0 m/s < max-speed <
8.0 m/s and 2.67 m/s < avg-speed < 5.33 m/s. So we were forcing the system response
into the very low end of “high max-speed” to meet the H°. In many cases, it undershot
max-speed (and sometimes avg-speed as well) in satisfying H°, and subsequent
applications of the WADJ heuristics were, in the presence of obstacles, not precise
enough to hit the 0.2 m/s window that would satisfy both max-speed constraints exactly.

Figure 28: 2DOF S° failure rates by constraint set


The Constraint Set 4 failure rates may reflect a weakness in the WADJ rule
generation method. Constraint Set 4 included two H° upper constraints on max-acc and
max-speed, and then S° numeric range constraints on energy and avg-speed. Except for
Obstacle Set 4 (which seems to be the easier obstacle set in Figure 27), the energy range
was uniformly failed. Maneuvering around obstacles may take more energy than
predicted by the fuzzy rules derived from the WADJ curve data generated in empty space
or in a field with only one obstacle.

Constraint Set 2 was the most successful. It combined a hard upper limit on
max-acc and a hard lower limit on min-sep (min-sep > 1.7 m) with the fuzzy constraint
“safely,” which entailed “high min-sep” and “low” max-speed, avg-speed, and max-acc.
Only the “high min-sep” was failed in Defaultn and INITn; it defuzzifies to 3.0 m <
min-sep < 5.0 m. (Note that this is the soft limit on min-sep; the hard limit of min-sep > 1.7 m
was met in all final cases.) A min-sep that high may have forced paths that detoured very
widely around the obstacles, and the extra time required to traverse them may have made
the trajectories too costly. While the time-based parameters were certainly discounted in
these trajectories, they were not entirely ignored; “low” speed does include a fuzzy lower
bound. There is a category of speed slower than “low,” namely “very low,” and the
EVAL module works on the assumption that the user does not want something lower than
low: slow, but not that slow.

Out of all 40 test cases run to completion, 5 were able to meet all H° and S°. Both
the Defaultn and INITn solutions for Constraint Set 2/Obstacle Set 3 and Constraint Set
4/Obstacle Set 4 were successful. INITn also met all limits on Constraint Set 1/Obstacle
Set 4. It is interesting that even though Constraint Sets 1 and 4 had the highest S° failure
rates overall, they also contained examples of fully successful trajectories.

In  addition  to  the  number  of  failures  and  successes,  we  were  interested  in  the 
overall  solution  quality.  Since  H°  and  S°  represent  two  different  things,  they  were  treated 
separately.  Our  assumption  is  that  H°  represents  boundaries  on  state  regions  into  which 
the  vehicle  may  not  go.  They  do  not  describe  any  preference  by  the  user  to  be  at  or  near 
the  limit;  in  fact,  we  assume  the  opposite,  that  being  well  away  from  the  hard  limits  is 
preferable.  To  this  end,  we  define  the  hard  margin  of  success  to  be: 


$$\mathrm{margin}^{i}_{H_j} = \begin{cases} \mathrm{abs}(F^{i}_{j} - H^{\circ}_{j}) / H^{\circ}_{j}, & F^{i}_{j} \text{ meets } H^{\circ}_{j} \\ \text{undefined}, & F^{i}_{j} \text{ does not meet } H^{\circ}_{j} \end{cases} \qquad (4.9)$$

That is, if the jth element of the ith feature vector F^i in the set of feature vectors F is
constrained by a jth element in H° and, further, meets that constraint, margin^i_Hj has a
value. The larger margin^i_Hj (which, dropping the indices, we refer to now as margin_H) is,
the farther from the hard limit H° the relevant trajectory feature is. Large values of
margin_H are desirable. (We do not define a similar margin of failure because there were
no H° failures in the data sets run to completion.) Figure 29 shows the histogram for
values of margin_H over this data set.

Figure 29: Margin of success for hard limits in 2DOF case


This data suggests the initial choice of weighting parameters has little effect on the final
solution quality. Both INITn and Defaultn achieved 8 instances of margin_H > 0.5,
meaning that the solution feature was less than 50% of the hard limit. INITn had more
instances of meeting the H° by less than 10% than Defaultn.
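The hard margin of Equation 4.9 can be sketched as a small function. The function name, the `is_upper` flag for distinguishing upper from lower limits, and the use of strict inequalities are assumptions made for illustration. The feature values in the usage example are max-speed results quoted in Section 4.4.1 against the 4.2 m/s hard limit.

```python
def margin_hard(feature, limit, is_upper=True):
    """Relative distance from a hard limit; None when the limit is violated."""
    met = feature < limit if is_upper else feature > limit
    if not met:
        return None  # Equation 4.9 leaves the margin undefined for failures
    return abs(feature - limit) / limit


if __name__ == "__main__":
    print(margin_hard(1.26, 4.2))   # comfortably inside the limit
    print(margin_hard(4.30, 4.2))   # hard limit violated, so no margin
```

A max-speed of 1.26 m/s against the 4.2 m/s limit gives a margin of 0.7, which is above 0.5 because the feature is less than half of the limit.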

Soft limits S°, on the other hand, represent those areas of the state space where the
user prefers that the vehicle operate. Those that arise from verbs or adverbs are broken
down into upper and lower limits on various state features, defining a range in which we
prefer the vehicle to operate. For these, we assume that the user would prefer to be
toward the center of the range; that, as in fuzzy logic, the “ideal” expression of the
desired behavior is not near the limits of permissibility, but toward the center. We define
the margin of success for S° as:

$$\mathrm{margin}^{i}_{S_j} = \begin{cases} \mathrm{abs}(F^{i}_{j} - \mathrm{midpoint}(S^{\circ}_{j})) / \text{half-range}(S^{\circ}_{j}), & F^{i}_{j} \text{ meets } S^{\circ}_{j} \\ \text{undefined}, & F^{i}_{j} \text{ does not meet } S^{\circ}_{j} \end{cases} \qquad (4.10)$$

where F^i_j is again the jth element of the ith feature vector computed by FEXT from the
trajectory, midpoint(S°_j) returns the center of the soft limit range that corresponds to the
feature under consideration, and half-range(S°_j) returns the distance from the midpoint to
either limit. When an individual margin_S = 0, the feature perfectly matches the desired
feature value. When margin_S = 1, the feature is exactly on one of the range limits. So in
these cases, smaller margin magnitudes are desired. Negative values indicate a feature
that was less than midpoint(S°_j), while positive values indicate the feature exceeded
midpoint(S°_j). (Note that soft upper or lower limits can also be treated by the system;
however, we had none in our test suite. Their margins would be computed in the same
fashion as margin_H.) Figure 30 shows the histogram for soft limit success margins.

Figure 30: Margin of success for soft limits in 2DOF case


There are no strong trends in this data for either the INITn or the Defaultn case. S°
are met without necessarily being driven entirely to the center of the acceptable range.
Given our satisficing approach, this is not surprising. Neither INITn nor Defaultn
noticeably outperformed the other in finding more solutions closer to midpoint. This
shows the robustness of WADJ, which (eventually) achieves similar results regardless of
starting point.

Unlike the hard limits case, there were enough S° which went unmet to warrant a
comparison of margins of failure for soft constraints. We define the margin of failure to
be:

\[
\mathit{margin}^{fail}_j =
\begin{cases}
\dfrac{F_j - \mathit{limit}(S^{\circ}_j)}{\text{half-range}(S^{\circ}_j)}, & F_j \notin S^{\circ}_j \\[1ex]
\text{undefined}, & F_j \in S^{\circ}_j
\end{cases}
\tag{4.11}
\]

where limit(S°_j) is the upper or lower limit (whichever was violated) of the jth element
in S°. The sign of the margin tells us how it failed: a negative margin_fail indicates that
a lower limit was failed, whereas a positive margin_fail means that an upper limit was
exceeded. In either case, smaller magnitudes are better, since they mean that the limits
were failed by a smaller amount. half-range(S°) is used as the scaling factor rather than
the numeric value of the limits or midpoint on the assumption that a narrower range is more
sensitive to errors of a given size than a larger range. That is, if the range is 100 units
wide, an error of one unit is of less import than if the range were only 2 units wide,
regardless of the numeric values of the ranges’ endpoints. Figure 31 shows the histogram
for margins of failure on S°.

Figure 31: Margins of failure on soft limits for 2DOF case


The results are again very similar for INIT^n and Default^n. Default^n has four more
solutions near the center, indicating smaller errors; however, it also has two more
solutions on the tails of the histogram. The standard deviation in Figure 31 is 2.41 for
Default^n and 2.15 for INIT^n. There does not seem to be a clear advantage to either
technique.
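
As an illustration only, the margin of failure of Eq. 4.11 can be sketched in a few lines of Python. The function name `margin_fail`, the use of `None` for the undefined case, and the treatment of the range endpoints as satisfying the constraint are assumptions made for this sketch, not details of our implementation.

```python
from typing import Optional


def margin_fail(feature: float, lower: float, upper: float) -> Optional[float]:
    """Margin of failure for one soft range [lower, upper], following Eq. 4.11.

    Returns None (undefined) when the feature value lies inside the range.
    Assumption: values exactly on an endpoint count as meeting the limit.
    """
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    half_range = (upper - lower) / 2.0
    if feature < lower:
        # Negative result: the lower limit was failed.
        return (feature - lower) / half_range
    if feature > upper:
        # Positive result: the upper limit was exceeded.
        return (feature - upper) / half_range
    return None


# The same one-unit error matters far more on a narrow range than a wide one.
wide = margin_fail(101.0, 0.0, 100.0)    # range 100 units wide
narrow = margin_fail(3.0, 0.0, 2.0)      # range 2 units wide
```

Scaling by the half-range is what makes the two one-unit errors above differ by a factor of fifty, which is the behavior argued for in the text.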

On the other hand, if INIT is used to support an anytime planning algorithm, there
is a definite benefit to using it. In addition to meeting more of the L°, INIT meets them
with better margins. Figure 32 shows the histogram of margin_fail for the returned
trajectories after one iteration of TPLAN. Results are similar for failures to meet H°.


[Histogram legend: Default^1, INIT^1; horizontal axis: margin_fail]

Figure 32: Margins of failure for soft limits after one iteration in 2DOF case

INIT^1 shows four more failures with the very smallest margin_fail. While Default^1 has
more failures in the general central region, this may be because Default^1 simply had more
failures overall. Default^1 also has four more failures in the high-error tail region. INIT^1
certainly has superior margins of success when compared to Default^1, as shown in Figure
33. INIT^1 not only has more successes, it has many more successes in the low-valued
region nearest the goal midpoint.


Figure  33:  Margins  of  success  for  soft  limits  after  one  iteration  in  2DOF  case 


4.5  Conclusions 

We  are  overall  satisfied  with  the  performance  of  our  architecture  and  the  INIT  and 
WADJ  routines  in  this  2DOF  case.  INIT,  while  not  greatly  impacting  final  solution 
quality,  did  shorten  the  time  needed  to  arrive  at  an  acceptable  solution  and  did  improve 
the  quality  of  early  solutions  -  a  boon  if  anytime  planning  is  to  be  used.  The  WADJ 
heuristics  we  developed  reliably  moved  the  solution  into  areas  that  satisfied  hard 
constraints H° 100% of the time, regardless of starting condition. The sometimes-competing
S° were not always satisfied; however, as WADJ refined the trajectory, the solution showed
definite improvement over the initial solutions achieved using either INIT or default
weights.

Some preliminary work has been done on extending this 2DOF work to the
multi-agent formation maintenance field. A term that penalizes the variation from the
desired formation can be added to the cost functional and weighted along with the
rest. Proof-of-concept work was done in the linear 2DOF case, with two vehicles
maneuvering in the presence of obstacles (Figure 34). High relative weights placed on
formation maintenance resulted in both vehicles moving around the obstacle together. High
relative weights for time resulted in trajectories where the vehicles parted ways to move
around the obstacle. Energy-efficient trajectories had a minimum disturbance from
straight-line trajectories: the vehicles parted ways to circumvent the obstacle, but with
smaller margins than they had in the “quickly” case. However, no rigorous collection of
multi-agent WADJ data has been performed to date; such analysis could be
applied to formations maneuvering in the presence of obstacles in future work.

Panel                      Average error in formation   Fuel (units)
All considerations equal   98.5%                        139.6
Conserve fuel              68.9%                        117.0
Maintain formation         6.2%                         149.9

Figure 34: Preliminary work in multi-vehicle formation management


It now remains to show how well, or even if, these techniques translate to a
more challenging six degree-of-freedom domain. Chapter 5 will consider a simplified
version of the space shuttle/ISS docking problem first solved with hand-tuned weights in
[51]. We will examine what different solutions can be found when other
constraints are placed on the problem, and further evaluate the performance of our
system in a 6-DOF domain with nonlinear dynamics.


5  Six Degree of Freedom Deep Space Satellite


The  experiments  described  in  Chapter  4  showed  our  architecture  could  be  useful. 
But  the  2DOF  point  rover  problem  was  highly  simplified  and  linear.  Would  the 
techniques  developed,  the  WADJ  rules  in  particular,  be  useful  in  a  nonlinear  space 
domain  as  well? 

We  began  research  into  reproducing  the  International  Space  Station/Space  Shuttle 
docking  simulation  presented  in  [51].  Unfortunately,  the  implementation  from  [51]  was 
sufficiently  complicated  that  the  optimal  controls-based  trajectory  planner  could  only 
reliably  identify  solutions  via  substantial  tuning  of  additional  parameters  beyond  cost 
function  weights  for  each  iteration.  Changing  the  cost  functional  weights  changed  the 
character of the problem sufficiently that the gain schedules used in [51] to incrementally
refine  thruster  approximations,  error  tolerances,  and  obstacle  penalty  function  values  no 
longer  guided  the  solver  to  convergence  in  many  cases.  Re-tuning  those  gain  schedules 
for  every  new  weighted  cost  functional  was  simply  beyond  the  scope  of  the  current 
research.  Unable  to  generate  trajectories  for  analysis  and  adjustment,  we  were  forced  to 
abandon  this  problem  for  a  somewhat  simpler  three-dimensional,  six  degree  of  freedom 
(space)  domain  example  with  dynamic  properties  that  were  nonlinear  but  less  complex 
than  orbital  motion. 

We  chose  to  examine  a  six  degree  of  freedom  satellite  operating  in  deep  space, 
away  from  gravitational  fields  but  potentially  in  proximity  to  obstacles  (e.g.,  asteroids). 
While  this  does  result  in  very  simple  and  linear  translational  dynamics,  the  rotational 
dynamics are nonlinear and make the problem more interesting than the simple 2DOF
rover. We assume impulse thrusters for translation and reaction wheels for torque
generation, following the modeling described in [51].


5.1  System  Dynamics 

The general state space form of a rigid body in deep space is given by:

u(t)/m
(5.1)


where: 

p is the location (position) vector of the body in the inertial reference frame
v is its velocity vector
ω is the rotational velocity vector expressed in the body frame
a is a representation of the body’s attitude (a modified Rodrigues vector [55])
H is a matrix of moments of inertia
S is the matrix representation of the cross product
G_a is an expression which, when multiplied by ω, gives the rate of change in a:

\[
\dot{a} = G_a(a)\,\omega = \frac{1}{2}\left[ I - S(a) + a a^{T} - \frac{1 + a^{T} a}{2}\, I \right] \omega
\tag{5.2}
\]

R is a rotation matrix that converts body coordinates to inertial coordinates
u is the vector of translational control input (force) expressed in the body frame
m is the mass of the rigid body
τ is the vector of rotational control input (torque)

As in [51], the mass used was 1 kg and Ixx = Iyy = Izz = 1 kg-m². Maximum thruster
output was ±30,000 N in each axis. Maximum torque output about each axis was smooth
until a saturation value of ±1,000 N-m.
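
To make the roles of the state variables concrete, the following Python sketch evaluates the time derivative of the state (p, v, ω, a) for a rigid body in deep space. It is a minimal sketch under stated assumptions rather than the model code of [51]: it uses the textbook Euler rotational equations, the standard modified Rodrigues parameter kinematics and rotation formula, explicit cross products (so that no sign convention for S is needed), and a diagonal inertia matrix given by its principal moments. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec = List[float]


def cross(a: Sequence[float], b: Sequence[float]) -> Vec:
    """Cross product a x b of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def body_to_inertial(a: Sequence[float], x: Sequence[float]) -> Vec:
    """Rotate body-frame vector x into the inertial frame for MRP attitude a."""
    s2 = dot(a, a)
    ax = cross(a, x)            # a x x
    aax = cross(a, ax)          # a x (a x x)
    d = (1.0 + s2) ** 2
    return [x[i] + (8.0 * aax[i] + 4.0 * (1.0 - s2) * ax[i]) / d for i in range(3)]


def rigid_body_derivative(
    state: Tuple[Sequence[float], Sequence[float], Sequence[float], Sequence[float]],
    u_body: Sequence[float],
    tau: Sequence[float],
    mass: float = 1.0,
    inertia: Sequence[float] = (1.0, 1.0, 1.0),  # assumed principal moments
) -> Tuple[Vec, Vec, Vec, Vec]:
    """Return (p_dot, v_dot, w_dot, a_dot) for state (p, v, w, a)."""
    p, v, w, a = state
    p_dot = [float(c) for c in v]
    # Body-frame thrust rotated into the inertial frame, divided by mass.
    v_dot = [f / mass for f in body_to_inertial(a, u_body)]
    # Euler's equations: H w_dot = tau - w x (H w).
    hw = [inertia[i] * w[i] for i in range(3)]
    gyro = cross(w, hw)
    w_dot = [(tau[i] - gyro[i]) / inertia[i] for i in range(3)]
    # MRP kinematics: a_dot = 1/4 [(1 - a.a) w + 2 a x w + 2 (a.w) a].
    s2, aw, axw = dot(a, a), dot(a, w), cross(a, w)
    a_dot = [0.25 * ((1.0 - s2) * w[i] + 2.0 * axw[i] + 2.0 * aw * a[i])
             for i in range(3)]
    return p_dot, v_dot, w_dot, a_dot
```

With equal unit moments of inertia, as used here, the gyroscopic term ω × Hω vanishes; the nonlinearity then enters through the attitude kinematics and through the rotation of body-frame thrust into the inertial frame.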

5.2  Terms  of  the  Cost  Functional 

We  have  the  same  concerns  in  the  6DOF  case  as  we  do  in  the  2DOF:  we  wish  to 
conserve  our  fuel  and  power  inputs,  accomplish  our  goals  in  a  timely  fashion,  and  not 
impact  obstacles.  Our  inputs  are  different  in  this  case;  rather  than  a  continuous 
electrically-powered  motor,  we  have  saturating  thrusters  for  3DOF  translation  and  an 
electrically-powered  reaction  wheel  for  3DOF  attitude  control.  As  a  result,  the  cost 
functional  has  the  form: 

\[
J = \int_{t_0}^{t_f} \left( W_1 \lVert u(t) \rVert_1 + \tau(t)^{T} W_2\, \tau(t) + W_3 + W_4 \max_{i \in \{O\}} o_i(r_i) \right) dt
\tag{5.3}
\]

Each of these terms, and their rationale, is explained in detail in [51]. Brief descriptions
are given below.

5.2.1  Thruster  Fuel 

The  one-norm  of  the  thruster  force  results  in  a  minimum-fuel  control  law.  This 
control  law  is,  however,  discontinuous  and  so  violates  the  assumptions  that  underlie  the 
numeric solution of the Euler-Lagrange equations. The solution in [51] was to
approximate  the  discontinuous  control  law,  beginning  with  an  approximation  with 
moderate  slopes  and  gradually  increasing  the  slope  to  nearer  a  step  function. 
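
The idea of sharpening a smooth approximation toward a discontinuous law can be illustrated with a short Python sketch. Minimum-fuel laws for bounded thrusters are of the bang-off-bang type: full thrust when a switching variable exceeds a threshold, zero thrust inside a dead zone. The hyperbolic-tangent form below, the switching variable `s`, the dead-zone threshold of 1, and the slope schedule are all assumptions chosen for illustration; the actual approximation and gain schedule are those of [51] and are not reproduced here.

```python
import math
from typing import List


def smoothed_min_fuel_control(s: float, u_max: float, slope: float) -> float:
    """Smooth stand-in for u = u_max * sign(s) if |s| > 1, else 0.

    As `slope` grows, the curve approaches the discontinuous law.
    """
    return 0.5 * u_max * (math.tanh(slope * (s - 1.0)) + math.tanh(slope * (s + 1.0)))


def slope_schedule(start: float = 2.0, factor: float = 4.0, steps: int = 4) -> List[float]:
    """Hypothetical continuation schedule: moderate slope first, then steeper."""
    return [start * factor ** k for k in range(steps)]


# Control just outside the dead zone (s = 1.2) for each slope in the schedule;
# the values climb toward the full thrust level as the slope increases.
U_MAX = 30000.0
levels = [smoothed_min_fuel_control(1.2, U_MAX, k) for k in slope_schedule()]
```

A solver would typically be run once per slope value, with each converged solution used as the initial guess for the next, steeper approximation.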
5.2.2  Electrical Energy

Still  following  [51],  the  rotational  actuators  are  assumed  to  be  powered  by 
electricity.  This  is  the  case  on  long-duration  missions,  and  has  the  benefit  of  separating 
translational  and  rotational  control  parameters.  This  form  of  the  cost  functional  is  an 
energy-minimizing  term,  a  standard  cost  representation  for  electrically-powered  systems. 

5.2.3  Time 

Since J is an integral over time, the cost functional needs only a constant term, W3, to
penalize elapsed time: integrated from t0 to tf, this term contributes W3(tf - t0) to J,
so minimizing J drives the completion time down.

5.2.4  Clearance  to  and  Speed  Near  Obstacles 

As in Chapter 4, the obstacle penalty function contains a cubic spline term that
penalizes nearness to the obstacles. This function also includes a velocity-based
component that penalizes speed near the obstacle. Since the cost functional is an integral
over time, a penalty based purely on clearance to the obstacle can be minimized by passing
very close to the obstacle but going very fast, so that the penalty accumulated over time
is smaller. The clearance penalty is therefore multiplied by a smoothed one-norm of the
velocity (within some epsilon of zero, the one-norm is approximated by a cubic to maintain
the smoothness properties necessary for convergence). As in the 2DOF case, the obstacle
clearance penalty goes to zero at some distance LIM from the obstacle; also as in the 2DOF
case, the penalties for each obstacle in the domain are summed.
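
The reasoning behind the velocity factor can be checked numerically. The Python sketch below is illustrative only: the cubic falloff `(1 - d/LIM)^3`, the particular cubic blend used near zero speed, the value of epsilon, and all function names are assumptions, not the spline or smoothing actually used in the penalty function.

```python
from typing import Sequence


def smooth_abs(x: float, eps: float = 1e-3) -> float:
    """|x| for |x| >= eps; a cubic blend with zero slope at x = 0 otherwise.

    The blend 2r^2/eps - r^3/eps^2 matches |x| in value and slope at r = eps.
    """
    r = abs(x)
    if r >= eps:
        return r
    return 2.0 * r * r / eps - r ** 3 / (eps * eps)


def clearance_penalty(distance: float, lim: float) -> float:
    """Hypothetical cubic falloff: 1 at contact, 0 at and beyond LIM."""
    if distance >= lim:
        return 0.0
    return (1.0 - max(distance, 0.0) / lim) ** 3


def obstacle_cost_rate(distance: float, velocity: Sequence[float],
                       lim: float, eps: float = 1e-3) -> float:
    """Clearance penalty multiplied by the smoothed one-norm of velocity."""
    return clearance_penalty(distance, lim) * sum(smooth_abs(v, eps) for v in velocity)


def pass_cost(speed: float, distance: float = 0.5, lim: float = 1.0,
              length: float = 2.0, speed_weighted: bool = True) -> float:
    """Cost accumulated over a straight pass of fixed length at constant clearance."""
    duration = length / speed
    if speed_weighted:
        rate = obstacle_cost_rate(distance, [speed, 0.0, 0.0], lim)
    else:
        rate = clearance_penalty(distance, lim)
    return rate * duration
```

In this sketch, doubling the speed halves the accumulated cost when the penalty depends on clearance alone, whereas the speed-weighted penalty accumulates the same cost at either speed, which removes the incentive to rush past the obstacle.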
5.3  Development of Weight Adjustment Heuristics and Fuzzy Rules

5.3.1  Weight  Adjustment  Heuristics 

For the 6DOF weights, the TPLAN code BVP4C2 assumed that the force weight
W1 was normalized to 1, and all other weights were relative to this. As a result, it was
more intuitive to work with W1 as the denominator in the weight ratios for our 6DOF
spacecraft domain. All of the code written to implement these 6DOF heuristics uses
torque/force and time/force weights, rather than the inverse as in the 2DOF case.
However, for some select charts and examples presented in this chapter, the weight ratios
were inverted for easier comparison to the Chapter 4 results. Not all of the WADJ
heuristic graphs and fuzzy rule sets presented here and in Appendix C reflect this
inversion; those that do not are labeled W2/W1 and W3/W1, as they were implemented. Our
weight vector Q included W1, the force weight; W2, the torque weight; W3, the time weight;
and LIM. (Since the obstacle penalty function weight W4 is never adjusted, we do not include
it in our weight vector representation.)

To develop WADJ rules, we again followed the general procedure outlined in Chapter 3,
this time in the 6DOF domain. We did not test different field sizes, as we had
confidence from the 2DOF results that they would scale well. (This confidence was
well-placed; our WADJ curves, below, were generated at a scale of 50 m while our test cases
were on the order of 10-20 m.) We tested pure translation (along one axis and along
three), translation plus rotation, and translation in the presence of obstacles. For the
translation in the presence of obstacles, the final state orientation was identical to the
initial state orientation; rotation was not required, but it was not forbidden, either.
Following the insights gained in the 2DOF domain, we plotted “per second” features
versus the ratio of the weight of the translational inputs (here, thruster force) W1 to the
time weight W3. The results for the feature avg-speed are shown in Figure 35, below.
Once again, there is a power-law relationship between speed and the force/time weight
ratio. We found this to be the case for the other force- and time-based quantities as well.
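
A power-law relationship of this kind can be fitted and then inverted to suggest a weight ratio for a desired feature value. The Python sketch below shows one way to do so with a least-squares line on log-log axes. The function names are hypothetical, and the sample data are synthetic values generated for illustration; they are not measurements from our WADJ curves.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(ratios: Sequence[float], features: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of feature = c * ratio**p on log-log axes."""
    xs = [math.log(r) for r in ratios]
    ys = [math.log(f) for f in features]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx
    c = math.exp(mean_y - p * mean_x)
    return c, p


def ratio_for_feature(target: float, c: float, p: float) -> float:
    """Invert feature = c * ratio**p to find the ratio giving `target`."""
    return (target / c) ** (1.0 / p)


# Synthetic example: speed falls as the force/time weight ratio rises.
sample_ratios = [0.25, 0.5, 1.0, 2.0, 4.0]
sample_speeds = [3.0 * r ** -0.5 for r in sample_ratios]
c, p = fit_power_law(sample_ratios, sample_speeds)
suggested_ratio = ratio_for_feature(6.0, c, p)
```

For these synthetic data the fit recovers the generating coefficient and exponent, and the inversion returns the ratio at which the curve reaches the requested speed.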


[Plot: avg-speed versus W1/W3]

Figure 35: WADJ curve for avg-speed in 6DOF domain

The path-based features (e.g., min-sep) were once again linear with the influence
limit of the obstacle penalty function. Unlike the 2DOF case, the trajectories were much
more likely to be plotted through obstacles. To handle this, we added an implicit hard
constraint to every trajectory, min-sep > 0 m. If the path went inside an obstacle, the
trajectory failed and LIM was adjusted to move the path out of the obstacle. This resolved
the problem.

Torque presented us with a challenge. Our WADJ test for torque included a
translation and a rotation, so that we would see the effects of dynamic coupling.
Following the intuition we had from the translational features, we tried plotting the
torque feature against the ratio of the torque weight W2 to the time weight W3. However,
since torque and the resulting rotational motions are the source of nonlinearity in the
system, initial results indicated no power law for WADJ, and we were at first concerned
that this heuristic might not be applicable.

Upon further examination, however, we identified a more promising heuristic.
Figure 36 shows the torque data grouped by their torque/force weight ratio (W2/W1)
and plotted versus the time/force ratio (W3/W1). Each line in Figure 36 represents a
fixed W2/W1 ratio. Even though that ratio is fixed, the amount of torque applied can be
increased or decreased by adjusting the W3/W1 ratio. Conversely, if W3/W1 were known and
fixed, changing the W2/W1 ratio could jump the torque up or down that family of linear
curves. Was there a predictable relationship between the slopes of the lines in Figure 36
and W2/W1?


Figure  36:  First  stage  of  WADJ  heuristic  for  determining  torque  in  6DOF  domain 


Figure 37 shows that there was. Our torque heuristic was implemented as follows. First,
all non-torque features were checked for limit failures and, if there were failures, the
weights were adjusted. Then the torque feature was checked. If it failed, the desired
torque value was divided by the current W3/W1 value to get the slope of the line we would
like to traverse in Figure 36. Then the power relationship shown in Figure 37 was
used to calculate the necessary W2/W1 ratio from the desired slope.
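
The two stages of the torque heuristic can be sketched in Python as follows. This is a hedged illustration, not our implementation: it assumes each line in the first stage passes through the origin (torque = slope × W3/W1), that the second stage has the power-law form slope = c × (W2/W1)^p, and that the coefficients `c` and `p` are available from a fit. The function name and the numeric values in the usage line are hypothetical.

```python
def torque_weight_ratio(desired_torque: float, w3_over_w1: float,
                        c: float, p: float) -> float:
    """Return the W2/W1 ratio expected to produce `desired_torque`.

    Stage 1: slope of the line to traverse = desired torque / current W3/W1.
    Stage 2: invert the assumed power law slope = c * (W2/W1)**p.
    """
    if desired_torque <= 0.0 or w3_over_w1 <= 0.0:
        raise ValueError("desired torque and W3/W1 must be positive")
    slope = desired_torque / w3_over_w1
    return (slope / c) ** (1.0 / p)


# Hypothetical coefficients: slope = 2 * (W2/W1)**-1.
new_ratio = torque_weight_ratio(desired_torque=4.0, w3_over_w1=4.0, c=2.0, p=-1.0)
```

Because the non-torque features are adjusted first, W3/W1 is treated as fixed by the time this function is called, and only the torque/force ratio changes.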
Figure 37: Second stage in torque heuristic in 6DOF

5.3.2  Fuzzy  Rules 

The  fuzzy  rules  were  generated  as  they  had  been  for  the  2DOF  case.  The  WADJ 
data  was  reverse  engineered  so  that  “very  high,”  “high,”  etc.,  weight  values  were 
correlated  back  to  the  trajectory  features  they  elicited.  We  again  found  that  the  range 
from  2'  ,  2'",  . . .,  2  ,  2  was  sufficient  to  describe  the  WADJ  relationship.  The  full  set  of 
fuzzy  rules  for  the  6DOF  spacecraft  domain  is  again  found  in  Appendix  C. 

5.4  Results 

Six  different  sets  of  constraints  L°  were  enforced  on  four  different  obstacle  fields 
{ O }  for  a  total  of  twenty-four  trials.  However,  one  constraint/obstacle  pairing  proved  to 
be  intractable  and  BVP4C2  could  not  converge  on  a  solution.  This  trial  (Constraint  Set  4, 
Obstacle  Set  4)  is  omitted  from  the  following  results.  There  were,  overall,  20  hard  limits 
(H°) and 60 soft limits (S°). The simplest constraint set enforced one soft constraint; the
most extensive had two hard constraints and six soft constraints. Appendix B fully
details the test cases.

Each of the twenty-three successful test cases was run from a default weight
vector Q1_default = [1, 1, 1, 1] and from a Q1 provided by INIT. Again, only the collocation
TPLAN BVP4C2 was used for trajectory generation. The results after one iteration
(labeled Default^1 and INIT^1) and after program completion (Default^n and INIT^n) were
examined for both starting weight vectors.

Before  considering  the  overall  results,  we  again  present  particular  solutions  to  two 
example cases for illustrative purposes. Both are for Constraint Set 1 and Obstacle Set 1.
We  first  show  the  results  obtained  when  starting  from  the  default  weight  vector,  then 
show  results  when  the  weight  vector  is  initialized  via  our  fuzzy  logic  rules. 

5.4.1  Example  Cases 

Constraint  Set  1  in  6DOF  is  analogous  to  Constraint  Set  1  in  2DOF.  The  soft 
constraint  “somewhat  quickly”  is  requested,  while  a  hard  constraint  H°=  {max-speed  < 
5.5  m/s}  is  also  included  to  force  the  solution  toward  the  low  end  of  the  range  of  max- 
speed  encompassed  by  “quickly.”  “Quickly”  is  still  defined  as  high  max-speed  and  high 
avg-speed,  but  the  values  that  define  the  fuzzy  ranges  of  “high”  have  of  course  changed 
for  the  new  domain.  We  required  that  max-speed  be  between  5.0  and  10.6  m/s,  and  avg- 
speed  be  between  4.0  and  8.6  m/s. 
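
The limits of Constraint Set 1 can be written down directly and used to check a candidate trajectory's features. The Python sketch below is a minimal illustration with hypothetical names; it is not the EVAL routine itself. Shortfalls are reported as negative numbers and overshoots as positive numbers, matching the sign convention used for margins of failure. The feature values in the usage lines are those reported for the example runs in this section.

```python
from typing import Dict, Tuple

# Constraint Set 1 in the 6DOF domain (values in m/s).
HARD_UPPER: Dict[str, float] = {"max-speed": 5.5}
SOFT_RANGES: Dict[str, Tuple[float, float]] = {
    "max-speed": (5.0, 10.6),
    "avg-speed": (4.0, 8.6),
}


def evaluate(features: Dict[str, float]) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (hard failures, soft failures) as signed amounts in m/s."""
    hard_fail = {name: features[name] - upper
                 for name, upper in HARD_UPPER.items()
                 if features[name] >= upper}          # H° requires strictly less
    soft_fail = {}
    for name, (low, high) in SOFT_RANGES.items():
        value = features[name]
        if value < low:
            soft_fail[name] = value - low             # negative: too slow
        elif value > high:
            soft_fail[name] = value - high            # positive: too fast
    return hard_fail, soft_fail


partial = evaluate({"max-speed": 4.92, "avg-speed": 4.19})
success = evaluate({"max-speed": 5.12, "avg-speed": 4.26})
```

The first call reports a single soft failure, a max-speed shortfall of 0.08 m/s, while the second reports no failures of either kind.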

We look first at the default case, where we started with all weights in Q1 equal
and normalized to 1 and the obstacle influence limit LIM = 1 as well. The results are
shown in Figure 38a-c, below. The force response (shown in body coordinates) was
typical for a time-efficient trajectory, accelerating to the midpoint and then decelerating.
max-speed was 2.33 m/s, well under the limit imposed by H°, but the S° limits were
failed: max-speed was 2.66 m/s too slow for “quickly,” and avg-speed was 2.10 m/s too
slow. The weight on time was increased to 4.57, up from its prior value of 1.0, so that
Q2 = [1.00, 1.00, 4.57, 1.00], where W1 is the fuel weight, W2 is the torque weight, W3 is
the time weight, and the fourth element is LIM. In this domain, the weights were normalized
by W1. Increasing W3 indicated that time was of more value to us, so the next trajectory
should take less time and hence be faster.

The torques seen in the Q1 results are a consequence of the (approximated) one-norm
in the cost functional fuel term. Consider a planar robot able to move in x, y, and θ. To
translate at a 45° angle to the x-axis with 1 N of force, it could fire the x and y thrusters
each at 0.7071 N, for a total expenditure of 1.4142 N as computed by the one-norm. Or
it could rotate 45° so that its x thruster was aligned with the direction of motion, then
thrust with 1 N. As long as the cost of the torque maneuver is less than the force saved,
this is more efficient.
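
The planar arithmetic can be verified with a few lines of Python. The function name is hypothetical; the sketch simply resolves a force of given magnitude and heading onto fixed x and y thrusters and sums the absolute components.

```python
import math


def one_norm_cost(force_magnitude: float, heading_deg: float) -> float:
    """One-norm of a planar force resolved onto fixed x and y thrusters."""
    theta = math.radians(heading_deg)
    fx = force_magnitude * math.cos(theta)
    fy = force_magnitude * math.sin(theta)
    return abs(fx) + abs(fy)


unrotated = one_norm_cost(1.0, 45.0)   # thrusters left at their default alignment
aligned = one_norm_cost(1.0, 0.0)      # x thruster rotated onto the direction of motion
saving = unrotated - aligned
```

For a unit force the saving is largest at 45°, where it equals √2 - 1 N, and it shrinks to zero as the direction of motion approaches a thruster axis.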

We see such fuel-minimizing torques in Constraint Sets 1-3; the numeric solver
did not identify cost savings via torque maneuvers in Constraint Sets 4 and 5. To validate
the use of torque commands to minimize fuel usage, Table 2 summarizes the one-norms of
the forces for Constraint Sets 1-3 represented in the inertial frame, representing
initial/default spacecraft thruster alignment, and in the body frame, representing actual
thruster alignment when the designated torques are applied. The forces in the body frame
are always less than or equal to the inertial forces, demonstrating that by rotating the
vehicle, less force needed to be applied. The effects are very small, however, so it is
not surprising that some cases did not alter spacecraft attitude (Constraint Sets 4 and 5),
especially given a small but non-zero penalty for the energy (torque) term.


Constraint Set   Obstacle Set   One-Norm of Inertial Forces (N)   One-Norm of Body Forces (N)
1                1              17.7316                           17.7311
1                2              18.9900                           18.9886
1                3              18.5617                           18.4592
1                4              13.6732                           13.4170
2                1              4.5407                            4.5407
2                2              5.3882                            5.3877
2                3              5.4126                            5.0528
2                4              3.8322                            3.8146
3                1              17.7316                           17.7311
3                2              20.1985                           20.1971
3                3              24.2088                           24.0482
3                4              14.3041                           14.0320

Table 2: One-Norm of Inertial and Body Forces

[Figure 38 panels: (a) path; (b) Velocities and Rotations for Default CS 1, OS 1, Iter 1; (c) Force and Torque for Default CS 1, OS 1, Iter 1]

Figure 38: 6DOF a) path, b) rates and c) inputs for Q1 = [1.00, 1.00, 1.00, 1.00]

Figure 39 shows the results for Q2. The overall completion time was almost
halved, with the speeds increasing accordingly. Indeed, the shape of the path, the rates,
and the input curves are almost unchanged except in magnitude. max-speed was 4.92 m/s
for this run, still meeting H° and still failing the max-speed component of S°, but only
by 0.08 m/s this time. avg-speed was 4.19 m/s, within its S° requirements. We were
converging toward a full solution, but were not quite there. The time weight was increased
a small amount, to 4.72, and another trajectory found.

Q3 produced the same trajectory as Q2. The returned features were the same; the
max-speed failed its S° by the same amount. The plots of path, rates and inputs were the
same. The change in weight was not sufficient to elicit a different response in the system
at our levels of precision. WADJ increased the time weight again, applying the 0.08 m/s
failure margin to the base time weight of 4.72 from Q3. The result was
Q4 = [1.00, 1.00, 4.86, 1.00].

[Figure 39 panels: (a) path; (b) Velocities and Rotations for Default CS 1, OS 1, Iter 2; (c) Force and Torque for Default CS 1, OS 1, Iter 2]

Figure 39: 6DOF a) path, b) rates, and c) input for Q2 = [1.00, 1.00, 4.57, 1.00]

This change was successful. Figure 40 shows the results for Q4. Again, the path
and the general response of the system were similar to all other solutions. But this time,
the feature values met both the H° and S° requirements. max-speed was 5.12 m/s, just
over the 5.0 m/s low end of “quickly,” and still below the H° limit of 5.5 m/s. avg-speed
was 4.26 m/s, also still within “quickly.” Here we see a case of convergence without
overshoot or cycling. Most of the benefit was gained in the first iteration of WADJ, when
the time weight was changed from 1.00 to 4.57. Had we stopped there, we would have
had only a partial success, but one with a very small soft constraint failure margin.
Continuing, we refined the answer further until all constraints were met and a fully
acceptable solution was found.

In each of these figures, we noted an apparent discrepancy between the applied
forces and the resulting velocities. The forces appeared to be linear, but the velocities did
not look quadratic. We plot the velocity gradient in the inertial x-direction along with the
applied forces in both the body x-direction and inertial x-direction in Figure 41. Note
first that the difference between the body and inertial force is negligible. Although the
numeric solver found some small force savings in this case by torquing the vehicle, the
savings occurred well below our levels of precision in most cases. Figure 41 also shows
exact agreement between the slope of the x-velocity curve and the x-input, as expected.

[Figure 40 panels: (a) Path for Default CS 1, OS 1, Iter 4; (b) Velocities and Rotations for Default CS 1, OS 1, Iter 4; (c) Force and Torque for Default CS 1, OS 1, Iter 4]

Figure 40: 6DOF a) path, b) rates and c) input for Q4 = [1.00, 1.00, 4.86, 1.00]

Figure 41: Velocity gradient compared to body and inertial forces

We continue with a shorter example: the same constraint and obstacle set, but
starting with an initialized weight vector. For this case, INIT returns
Q1 = [1.00, 1.00, 4.00, 1.00]. Already, we see that this is much closer to the Q2 of the
default case. We correspondingly expect that the first returned trajectory will be closer
to our expressed requirements and preferences.

Figure 42 shows that our expectations were met. The trajectory was still similar in form
to all the other solutions; max-speed was 4.57 m/s and avg-speed was 3.90 m/s. H° was met,
but both S° constraints failed. The magnitudes of the failures were much smaller than after
the first iteration of the default case, however: only 0.43 m/s for max-speed and 0.10 m/s
for avg-speed. The time weight was, as before, increased by WADJ to elicit a faster
response: Q2 = [1.00, 1.00, 4.79, 1.00]. This was between the unsuccessful Q3 and
successful Q4 of the default case, and so in the area of a known successful
weight vector. Q2 succeeded in meeting all constraints. The resulting trajectory, shown
in Figure 43, appears to be the same as was found for the default weight vector after four
iterations. The max-speed and avg-speed were the same, 5.12 m/s and 4.26 m/s.

Here, then, was shown the main advantage of INIT and the strength of WADJ.
INIT saved two iterations in this example because we were placed in the general area of a
successful weight vector on the first try. When trajectory calculations can take from
twenty minutes to several hours, this is no small savings. However, regardless of starting
point, the WADJ heuristics got us to near-identical solutions. (Actually, as the z-axis
torque results are so very far below our levels of precision that they could be counted as
identically zero, we could say that they are identical.) Further, we saw that the first use
of WADJ in the default case resulted in large improvements in solution quality in a single
step.

Such success with INIT will not always be observed; in cases of competing
constraints, as seen in Chapter 4, the weights can oscillate between those satisfying one
constraint and those satisfying the other. Either the oscillation magnitude will decrease
until eventually settling, or else a cycle will be detected by EVAL, at which point the
process will terminate. In either of those cases, solution quality may still improve,
although not as consistently as in these examples.

Figure 42: 6DOF a) path, b) rates, and c) inputs for Q1 = [1.00, 1.00, 4.00, 1.00]

[Figure 43 panels: (a) Path for Initialized CS 1, OS 1, Iter 2; (b) Velocities and Rotations for Initialized CS 1, OS 1, Iter 2; (c) Force and Torque for Initialized CS 1, OS 1, Iter 2]

Figure 43: 6DOF a) path, b) rates, and c) inputs for Q2 = [1.00, 1.00, 4.79, 1.00]


5.4.2  Overall 6DOF Results


Figure 44 shows the total number of failures for each of our solution cases. As in
Chapter 4, these are INIT^1, the solution generated using the weights suggested by INIT;
INIT^n, the solution generated by running the INIT^1 solution to conclusion through the
architecture; Default^1, the solution generated using a default weight vector with all
weights equal and LIM = 1; and Default^n, the Default^1 solution run to completion.


[Bar chart legend: Soft limit failures, Hard limit failures; horizontal axis: Case]

Figure 44: Failures for each solution case out of 20 H° and 60 S°

The results for soft limit failures S° are similar to the 2DOF case. The INIT procedure
results in noticeably fewer soft constraint failures. There are more hard limit H° failures
with INIT due to our examples with competing constraints, but again no hard limit
failures were present in the final solutions. A closer look at the data in Figure 45 shows
that the differences from the 2DOF results are not quite as compelling as might appear at
first glance.

Figure 45: Margins of failure for hard limits after first trajectory generation for 6DOF cases
[Histogram; series: INIT1, Default1]


Although Eq. 4.6 is presented as a margin of success for a hard limit, we use the same
equation to calculate the margins of failure here. Most of the INIT1 H° failures, after
normalization, fall between -1 and 1. That is, they are relatively small, and the solution
was actually fairly close to meeting the hard limits. But it did not meet them, which raises
the question: was the INIT routine helpful in this case? Total limit failures after one
trajectory was computed were 54 with INIT1 and 57 with Default1, not a significant difference.
After completion, both INITn and Defaultn were again without H° failures, and the
difference in S° failures was also small (31 for INITn and 33 for Defaultn). The average
number of iterations required for completion was 5.1 for INITn and 5.3 for Defaultn, so
there was not even the time savings that was seen for the 2DOF case. As in the 2DOF case,
however, INIT solved five problems on the first try, which the default never did. It also
had a single case with a large number of iterations (24), more than the longest
default run (11 iterations). But an examination of that 24-iteration run (Constraint Set 1,
Obstacle Set 4) shows that, after 2 iterations, all the limits were met within our level of
precision. However, the hard limit on max-speed was still exceeded by an amount
smaller than that precision. Since EVAL did not round the failure margins to our level of
precision, this was viewed as a failure and the architecture kept trying to find a solution
until the weights looped. On average, INIT had a much-reduced impact on the 6DOF cases,
neither significantly helping nor hurting the results. Figure 46 shows a frequency histogram
for the number of iterations required (Constraint Set 1, Obstacle Set 4 is omitted for INIT1).
The advantage of INIT is clearer here; a majority of the runs started with INIT finish in
one or two iterations, while those starting from default weights need at least three.
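
The 24-iteration run turned on a comparison made at full floating-point precision. The sketch below shows how a tolerance-aware check on an upper hard limit might look. The function name `exceeds_hard_limit`, the `precision` parameter, and the numeric values are illustrative assumptions; this is not the EVAL implementation.

```python
def exceeds_hard_limit(value, limit, precision=0.0):
    """Return True if value exceeds an upper hard limit by more than precision.

    With precision=0.0 this is a strict comparison, so any excess, however
    small, counts as a failure. A positive precision ignores smaller excesses.
    """
    return (value - limit) > precision


# Hypothetical numbers: a max-speed 0.0004 m/s over a 5.5 m/s limit.
strict = exceeds_hard_limit(5.5004, 5.5)                    # counted as a failure
tolerant = exceeds_hard_limit(5.5004, 5.5, precision=0.01)  # treated as met
```

Whether a hard limit may be relaxed by the reporting precision is a design decision; the sketch only shows where such a tolerance would enter the comparison.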


Figure 46: Number of iterations through architecture
[Histogram; x-axis: # iterations; series: Defaultn, INITn]

Displaying the S° failures by obstacle set shows trends similar to the 2DOF case,
although the median failure rate is higher. Excluding the INIT1 case in Obstacle Set 4 in
Figure 27, the median S° failure rate by obstacle set was 50% for the completed cases
(Defaultn and INITn) with a spread of about 6%. Here, neglecting the one very low
failure rate case in Obstacle Set 1 this time, we have a range from 54% to 77%, with the
median at 65.5% S° failures in the completed cases. The range of failure rates is broader,
but there still does not seem to be an obstacle-based trend in solution fitness.

Figure 47: 6DOF S° failures by obstacle set

Figure  48  shows  the  percentage  of  S°  failures  by  constraint  set.  Here  we  see 
definite  trends,  with  some  constraint  sets  being  apparently  simple  to  entirely  satisfy, 
while  others  had  100%  S°  failure  rates. 


Figure 48: 6DOF S° failures by constraint set
[Bar chart; series: Default1, INIT1, Defaultn, INITn]


Constraint  Sets  2  and  3  had  small  numbers  of  noncompeting  soft  constraints  and 
no  hard  constraints.  Constraint  Set  2  had  a  soft  numeric  range  limit  on  thrust;  Constraint 
Set 3 was “a little quickly,” which defuzzified into soft constraints on max-speed and
avg-speed. With no other requirements, these constraints were solved much more
successfully. INITn solved them entirely for all obstacle cases; Defaultn had small errors
with  Obstacle  Set  1  (in  combination  with  Constraint  Set  2)  and  Obstacle  Set  4  (in 
combination  with  Constraint  Set  3). 

Constraint  Set  1  was  very  similar  to  Constraint  Set  1  in  the  2DOF  case;  we  set  a 
hard  limit  on  max-speed  and  also  the  soft  preference  for  “somewhat  quickly.”  The  hard 
limit  was  toward  the  low  end  of  the  fuzzy  ranges  that  define  “quickly,”  forcing  the 
system  to  hit  a  small  window  of  feature  values  that  would  satisfy  both.  While  this 
constraint set gave the 2DOF case some problems, here it was solved successfully
in all cases that went to completion.

We also note that for these first three constraint sets, INIT1 returns remarkably
better initial solutions than Default1. By happenstance, the default weights produce
results that do not meet any of the S°, while INIT1 achieves at least partial success.
So  we  see  here  one  practical  application  for  INIT:  in  those  cases  where  there  are  few  or 
noncompeting  constraints,  it  provides  an  excellent  initial  guess  compared  to  using  default 
weights. 

Constraint  Sets  4  and  5  were  clearly  less  successful.  Constraint  Set  4  included 
two  hard  upper  limits  on  max-acc  and  max-speed  and  two  soft  range  limits  on  force  and 
avg-speed.  It  appears  that  when  the  suggested  weights  for  the  force  and  avg-speed  ranges 
were  combined  via  the  centroid  calculation,  the  force  terms  were  much  more  sensitive  to 
the  change  away  from  their  own  desired  values.  Further,  the  hard  limit  on  max-acc  was 
greatly exceeded in all initial cases. By the end of the iterations, the hard limits were all
met, but force failed in all cases, falling under the lower limit of the range. Because
such a low max-acc was required, the system had to use less thrust than specified by the
soft  range.  Similarly,  all  final  avg-speeds  failed  low,  as  the  trajectory  had  to  go  slowly 
enough  to  meet  the  upper  limit  on  max-speed  (a  hard  limit).  Essentially,  the  stated  soft 
constraints  in  Set  4  had  to  fail  for  the  hard  constraints  to  be  met.  (This  constraint  set, 
when  combined  with  Obstacle  Set  4,  failed  to  converge  for  either  the  default  or  initialized 
case.  So  only  results  from  Obstacle  Sets  1-3  have  been  considered  here.) 

Constraint  Set  5  added  the  soft  fuzzy  preference  “moderately  safely”  to  Constraint 
Set  4.  This  defuzzified  into  four  more  soft  range  constraints.  We  were  practically 
guaranteed a certain failure rate, since the upper limit of the avg-speed constraint from
“safely” equaled the lower limit of the soft range constraint on avg-speed from Constraint Set 4.
Of course, if we failed low on avg-speed, as we did for all of the Constraint Set 4 cases,
we would be meeting the “safely” constraint, decreasing our overall failure rate. Soft
constraints  on  max-acc  and  max-speed  arising  from  “safely”  were  sometimes  met  when 
the  hard  constraints  were  met,  again  decreasing  the  failure  rate.  (And  when  they  were 
failed,  they  failed  low  as  in  Constraint  Set  4.)  We  saw  very  large  failure  margins  which 
were  greatly  reduced  by  the  end  of  the  iterations. 

Constraint  Set  6  was  “very  energy-saving,”  which  decomposed  into  soft  range 
constraints  on  force  and  torque.  But  since  the  only  torque  needed  in  the  trajectories  was 
that  required  to  avoid  obstacles,  the  trajectories  all  failed  low;  they  could  not  use  enough 
torque to satisfy the “low torque” constraint. “Low force” was more typically met, or
failed high with very small margins (0.05, 0.02 N) for the completed cases (Defaultn,
INITn). Since the nonlinearity of the system is in its rotational dynamics, and since we
wish to show that our architecture will work with nonlinear systems, we decided to
rewrite the Xf to include explicit rotational changes and rerun it over the set of obstacles
under Constraint Set 6. The results are presented separately in Section 5.4.4, Torque, below.

Margins  of  success  for  H°  were  overall  much  better  than  in  the  2DOF  case.  Using 
Eq. 4.6 to define margin_h, we had many more cases where the margins were very near 1,
as  shown  in  Figure  49. 

Figure 49: Margins of success for hard limits for 6DOF cases

Recall that we assume that we would like to stay as well under H° as possible. Here we
have  a  full  2/5  of  all  solutions  having  features  less  than  90%  of  the  H°  values.  We  again 
see  that  the  final  solution  quality  (with  respect  to  H°)  does  not  depend  significantly  on  the 
starting  point  of  the  solution.  We  continue  to  interpret  this  as  a  sign  of  the  robustness  of 
the  overall  architecture. 

Figure 50 shows the margins of success for S° in these cases; margin_s is defined
in Eq. 4.10. More of the data lies at the edges of the histogram than in the 2DOF case.
Thirteen of the cases binned at -0.9 and -0.8 (for both INITn and Defaultn) resulted from
Constraint Set 1, which required “high max-speed” as a soft limit but
“max-speed < 5.5 m/s” as a hard limit. The range for “high max-speed” was defined for
the 6DOF domain as 5.0 < max-speed < 10.6. The result is that the architecture takes full
advantage  of  the  fuzziness  of  the  soft  constraint  and  drives  it  well  into  the  low  end  of  that 
range  to  meet  the  more  restrictive  hard  constraint. 
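
The size of the window left by these two limits can be checked directly. The sketch below intersects the soft range with the hard upper limit using the values quoted in the text; the helper name `feasible_window` is hypothetical and is not part of the architecture.

```python
def feasible_window(soft_range, hard_upper):
    """Intersect a soft range (low, high) with a hard upper limit.

    Returns the (low, high) window satisfying both, or None if it is empty.
    """
    low, high = soft_range
    high = min(high, hard_upper)
    return (low, high) if low < high else None


# Values quoted in the text for Constraint Set 1 in the 6DOF domain (m/s).
window = feasible_window((5.0, 10.6), 5.5)
# Share of the original soft range that remains available.
fraction = (window[1] - window[0]) / (10.6 - 5.0)
```

Under these numbers the window is 5.0 to 5.5 m/s, roughly 9% of the soft range, which is consistent with the successes clustering near the low edge of “high max-speed.”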

Figure 50: Margins of success for soft limits for 6DOF cases

Figure 51 shows the margins for S° failures, calculated according to Eq. 4.11.
Unlike  Figure  31,  there  is  no  tail  of  high  margins.  In  this  respect,  we  had  better 
performance  in  the  complex  6DOF  domain  than  in  the  2DOF  domain.  Again,  final 
results  are  not  much  affected  by  initialization. 

Figure 51: Margins of failure for soft limits for 6DOF cases

5.4.3 WADJ Performance

For  many  of  the  runs,  a  solution  was  returned  after  between  one  and  three 
iterations (Figure 46). In runs of one or two iterations, the weights converged with no
overshoot. For those default weight cases that took three iterations, some converged
monotonically  to  the  correct  weights,  while  others  overshot  on  the  second  iteration  and 
corrected  with  the  third.  How  did  the  system  behave  for  more  complex  constraint  sets? 
Figure  52  shows  a  trace  for  the  W3/W1  and  LIM  weights  for  the  initialized  run  through 
Constraint  Set  5,  Obstacle  Set  1. 

Figure 52: Weight values evolving through Constraint Set 5, Obstacle Set 1 starting from INIT1


Constraint Set 5 had hard upper limits on max-speed and max-acc, soft limits on
force  and  avg-speed,  and  the  fuzzy  constraint  “moderately  safely.”  It  was  nearly 
impossible  to  satisfy  the  soft  requirements  on  avg-speed,  which  would  only  be  met  by 
achieving  an  avg-speed  of  1.8  m/s  exactly.  The  initial  guess  for  W3/W1  greatly 
underpredicted  the  max-acc  for  the  trajectory,  and  WADJ  adjusted  that  weight  down  from 
4.0  to  0.03  to  compensate.  At  the  same  time,  the  minimum  min-sep  entailed  by  “safely” 
was  not  met,  so  LIM  was  increased. 

In  the  second  iteration,  the  hard  limits  and  the  min-sep  soft  limit  were  all  met,  but 
the  remaining  soft  range  limits  were  all  failed  low.  The  system  was  too  slow  and  used 
too  little  force.  Since  the  current  weight  vector  Q2  satisfied  H°,  it  was  stored  as  a 
possible  return  value.  Then  the  routine  continued  to  see  if  it  could  also  meet  the  S°. 
W3/W1 was increased again to try to meet these soft limits.
That failed the max-acc hard upper limit again and the min-sep soft limit. W3/W1 was
decreased to 0.01 and LIM increased even further to get the weights for the fourth
iteration, Q4.

This pattern (decreasing W3/W1 and increasing LIM, meeting H° but not all S°, and
then readjusting W3/W1 to a higher value) was repeated in iterations 4 through 6. After
iteration 6, the first attempt at WADJ caused a loop in the weights. So EVAL reverted to
the best weights found thus far, Q2, and called the secondary WADJ heuristic rule. This
decreased LIM by 10% while leaving W3/W1 unchanged.

Although  the  results  of  the  second  and  seventh  iteration  weight  vectors  are 
identical  to  our  levels  of  precision,  the  EVAL  routine  found  some  small  differences  and 
judged  Q2  and  its  results  superior.  When  the  seventh  iteration  returned  with  S°  failures, 
our time_limit was reached and Q2 and its resulting trajectory were returned.
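
The control flow traced above (store any weight vector that meets H°, keep trying for S°, revert to the best stored weights when the primary adjustment would revisit a weight vector, and return the best result at the time limit) can be summarized in a short sketch. The callables `evaluate`, `adjust`, and `fallback` are hypothetical stand-ins for the trajectory generation and evaluation steps and for the primary and secondary WADJ rules; the toy rules at the bottom are invented purely to exercise the loop.

```python
def run_architecture(weights, evaluate, adjust, fallback, time_limit):
    """Iterate weight adjustment, keeping the best hard-feasible weights.

    evaluate(w) -> (hard_ok, soft_failures); adjust(w) and fallback(w) -> new w.
    All three callables are hypothetical stand-ins, not the actual routines.
    """
    seen = []      # weight vectors already tried
    best = None    # (soft_failures, weights) for the best hard-feasible try
    for _ in range(time_limit):
        seen.append(weights)
        hard_ok, soft_failures = evaluate(weights)
        if hard_ok:
            if soft_failures == 0:
                return weights                       # complete success
            if best is None or soft_failures < best[0]:
                best = (soft_failures, weights)      # store as possible return
        candidate = adjust(weights)
        if candidate in seen and best is not None:
            # Primary rule would loop: revert to best weights, use secondary rule.
            candidate = fallback(best[1])
        weights = candidate
    return best[1] if best is not None else None


# Toy rules (invented): weights are (ratio, LIM); hard limits hold when ratio <= 0.5.
toy_evaluate = lambda w: (w[0] <= 0.5, 3)
toy_adjust = lambda w: (0.03, w[1]) if w[0] > 0.5 else (4.0, w[1])
toy_fallback = lambda w: (w[0], w[1] * 0.9)   # decrease LIM by 10%, keep the ratio

result = run_architecture((4.0, 1.0), toy_evaluate, toy_adjust, toy_fallback, 7)
```

In this toy run the second weight vector is the first to satisfy the hard limits, no later vector improves on it, and it is the one returned when the iteration budget runs out.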

5.4.4  Torque 

In our original data set, only one of eight completed trajectories (4 Defaultn, 4
INITn) met the “low torque” requirement. The rest were too low to be considered “low”
by the standards of our fuzzy rule set. Since none of our goal states required a rotation,
the  only  rotations  required  were  those  needed  to  avoid  obstacles.  These  did  not  use 
sufficient  torque  to  be  considered  low.  So,  to  test  the  torque  WADJ  rules,  we  included  a 
rotation  change  in  each  axis  at  the  goal  state  and  re-ran  the  tests. 

We also had some concerns about the possible interaction of force and torque. In
WADJ, all features except torque are first checked for adjustment. Then the selected
W3/W1 ratio is used together with the desired torque value to calculate a slope from
Figure 36; that slope is then used to pick a W2/W1 ratio via the equation in Figure 37. The
“low force” requirement was keeping W3/W1 small, and the heuristic is less
well-conditioned for W3/W1 less than 1. Although meeting mixed constraints is an important
goal, we also wanted to isolate the torque response to the WADJ process, since it is so
different from the other WADJ heuristics. So we created additional test cases: Constraint
Set 7, “low torque,” and Constraint Set 8, “medium torque.”

Figure  53  shows  our  overall  failure  rates  for  these  three  constraint  sets  (CS  7,  8, 
and  the  revised  CS  6);  each  case  had  16  runs  (four  soft  constraints  run  over  four  obstacle 
sets).  The  Default1  and  INIT1  cases  are  high  again,  not  unexpectedly,  and  the  failure  rates 
for  the  completed  runs  are  much  lower.  All  seven  failures  at  run  completion  were  torque 
failing  low;  of  those  seven,  four  were  from  the  “medium  torque”  Constraint  Set  8.  W2/W1 
was  continually  adjusted  down  to  discount  it,  to  allow  for  greater  torque  in  these  cases, 
but  what  was  required  was  that  W3/W1  be  increased.  Since  there  was  a  tacit  assumption 
that  some  other  state  feature  would  be  relying  on  W3/W1  and  that  it  may  have  been 
adjusted  to  affect  that  other  feature,  the  torque  WADJ  never  altered  W3/W1,  and  W2/W1 
could not be adjusted sufficiently before time_limit was reached. Our current TPLAN
cannot handle direct maximization of trajectory qualities; it can only minimize. We have
found that we can minimize features which are inversely related to our feature of interest
for a maximizing effect; thus by penalizing time, we can usually force an increase in
speed.  Another  TPLAN  might  allow  for  direct  maximization  of  features. 

Figure 53: 6DOF S° failures for Constraint Sets 6, 7 and 8


Figure  54  shows  the  number  of  iterations  required  for  these  runs.  All  of  those 
runs  which  took  four  or  fewer  iterations  to  return  a  solution  returned  a  complete  success. 
The  utility  of  INIT  is  again  shown  in  the  large  number  of  runs  that  returned  successful 
trajectories  after  only  one  or  two  iterations;  eight  (2/3  of  the  total)  of  the  trajectories 
created  using  INIT  were  solved  in  two  or  fewer  iterations,  while  only  3  trajectories 
created  using  default  weights  met  this  standard. 

Figure 54: Number of iterations required for Constraint Sets 6, 7 and 8

The  margins  of  success  on  these  runs  showed  no  particular  trends.  The  margins 
of  failure  on  the  cases  run  to  completion  were  all  small,  with  the  normalized  magnitude 
of the largest being 0.51; the smallest was on Constraint Set 6, Obstacle Set 3, which failed
low  on  torque  by  a  normalized  margin  of  only  -0.07. 

5.5  Conclusions 

In  this  chapter  we  have  demonstrated  that  WADJ  heuristics  can  be  developed  for  a 
deep  space  6DOF  domain  with  nonlinear  dynamics.  Our  results  were,  if  anything,  better 
in  the  6DOF  domain  than  in  our  2DOF  domain,  with  smaller  S°  failure  margins  and 
larger  success  margins  for  H°.  The  average  number  of  iterations  required  to  find  a 
solution was commensurate with the 2DOF case, also arguing that the technique
implemented  in  the  architecture  will  scale  well  with  the  dynamic  complexity  of  the 
domain.  The  surprising  similarity  of  the  2DOF  and  6DOF  WADJ  curves,  even  to  the 
values  of  the  coefficients,  is  noteworthy.  That  also  argues  for  the  potential  for  a  general 
application  to  the  optimization  of  dynamic  systems. 

The performance of our TPLAN, BVP4C2, was less robust and far slower than
we had hoped. Prior work on this algorithm required extensive hand-tuning of several
sets of gains just to solve a single trajectory problem. We were running it, on average,
5.2 times per problem for 24 fairly different problems. So these difficulties are not
entirely  unexpected.  In  the  future,  however,  a  different  TPLAN  should  be  selected  for 
work  with  nonlinear  systems. 



6  Conclusions  and  Future  Work 


The examples in Chapters 4 and 5 have shown that our architecture from Chapter 3,
which optimizes trajectories over hard constraints and natural language preferences, can
indeed achieve our requirements:

•  The  WADJ  heuristics  consistently  direct  the  weights  toward  values  that  meet  hard 
and soft constraints and are robust to differences in initial weight sets.

•  The  fuzzy  logic  enables  a  more  natural  human  interface,  opening  a  route  to  easy 
tasking  of  autonomous  agents  by  non-expert  users  (e.g.,  hospital  staff 
commanding  a  robotic  assistant,  warfighters  with  a  Future  Combat  System  robot, 
the  elderly  using  a  companion  robot). 

•  The  ability  to  meet  hard  numeric  constraints  is  not  lost  in  adding  the  fuzziness. 
This  allows  the  system  to  be  used  as  an  “automated  graduate  student,”  overseeing 
trajectory  generation,  rejecting  those  which  do  not  meet  required  hard  constraints, 
and  making  intelligent  adjustments  to  the  weights  to  move  the  solution  in  the 
required  direction. 

•  Proper  weight  initialization  saves  computation  time. 

o  One  iteration  saved  on  average  in  2DOF 
o  60%  of  6DOF  cases  solved  in  under  3  iterations 

•  Substantial  knowledge  engineering  and  preprocessing  was  required  to  develop  the 
fuzzy  rules,  the  WADJ  rules,  and  the  TPLAN  implementation. 

•  But  once  this  offline  process  was  completed,  the  system  was  applicable  to  a  wide 
range  of  obstacle  and  constraint  conditions  with  no  further  adjustments. 

• This makes the architecture useful for robots operating long-term in a consistent
environment,  but  not  so  useful  for  “one  off”  operations  such  as  technology 
demonstrations. 

In  all  of  our  86  cases  run  to  completion,  every  hard  constraint  placed  on  the  trajectory 
was  met.  No  maximum  accelerations  or  velocities  were  ever  exceeded,  and  no  paths 
through  obstacles  were  ever  returned. 

Our  test  runs  were  specifically  constructed  to  include  competing  soft  constraints 
so  that  it  was  not  possible  to  satisfy  them  all.  Our  architecture  did  manage  to  balance 
these  competing  constraints  well  overall,  returning  trajectories  that  had  typically  small 
margins  of  failure.  A  few  soft  constraint  failures  were  notably  larger;  future  work  should 
discover  if  this  is  the  result  of  fuzzy  rules  degradation  in  the  presence  of  obstacles  (as 
seems  likely). 

The  techniques  developed  for  construction  of  WADJ  curves  that  the  architecture 
uses  to  adjust  cost  functional  weights  and  affect  trajectory  features  were  generalizable 
from the linear 2DOF case to the nonlinear 6DOF case, although more manipulation and
insight were required for the 6DOF case.

The  INIT  routine  was  somewhat  useful  in  the  2DOF  linear  case,  but  on  average 
made  little  difference  in  the  6DOF  nonlinear  case,  which  was  surprising.  It  did  allow,  in 
certain  constraint/obstacle  combinations  where  there  were  no  competing  constraints,  the 
one-iteration  solution  of  the  problem,  which  a  default  weight  vector  never  achieved.  It 
could  also  set  the  architecture  up  for  a  cycle  between  two  opposing  weights  that  would 
take  longer  to  resolve  than  the  default  weights  typically  did.  Those  problems  with  fewer 
and non-competing constraints more frequently achieved the one-step solution; certainly
INIT should continue to be used for problems with similar characteristics.

A  substantial  knowledge  engineering  effort  was  required  to  develop  the  WADJ 
curves  and  fuzzy  rules  sets.  Once  this  was  done,  the  resulting  heuristics  were  useful  over 
a  range  of  constraint  and  obstacle  sets.  While  the  work  required  to  set  up  WADJ  and  an 
appropriate  INIT  is  not  trivial,  it  is  far  less  than  the  time  required  to  compute  the  Pareto 
front  for  the  trajectory  planning  problem.  Once  the  framework  for  developing  WADJ 
rules  and  the  databases  used  by  INIT  was  developed,  the  domain-specific  development 
could  be  accomplished  within  a  few  days.  This  is  because  the  simulations  were  run  in 
empty  or  mostly-empty  fields,  for  which  the  trajectory  planning  problem  solves  quickly. 
The  architecture  was  then  ready  to  run  trajectory  planning  problems,  returning  good 
solutions  after  an  average  of  five  or  six  iterations  of  trajectory  generation.  A  run  of 
twenty-four  iterations  was  considered  anomalously  high.  For  a  Pareto  front  to  be 
developed  for  a  single  point-to-point  traversal,  the  EMO  methods  would  have  to  generate 
and  evaluate  hundreds  or  thousands  of  candidate  trajectories.  In  a  cluttered  field,  for  a 
nonlinear  system,  using  a  computationally-intensive  solver  like  BVP4C,  each  trajectory 
generation  can  take  hours. 

For  one-off  technology  demonstrations  or  single-mission  robots,  this  approach 
may  not  be  the  right  one.  In  those  cases,  effort  spent  creating  a  generic  profile  of  the 
system  might  be  better  invested  in  carefully  engineering  the  one  particular  trajectory  of 
interest.  This  architecture  is  useful  when  a  robot  is  going  to  be  active  in  the  same  domain 
for  a  long  time,  performing  a  variety  of  missions  under  different  dynamic  constraints. 

6.1 Future Work


6.1.1  Improving  INIT 

A  more  sophisticated  version  of  INIT  could  look  at  the  constraint  set  as  a  whole 
and  recognize  potential  conflicts.  Currently,  the  system  will  go  through  many  iterations 
trying  to  satisfy  conflicting  constraints.  If  the  conflicting  constraints  are  both  hard 
constraints, it could be many iterations before a weight cycle is detected. (In this
research, the only such cycle took 24 iterations to be detected.) Either early detection of
this kind of possible conflict, or a software monitor that detects a pattern of cycling back
and forth for “too many” iterations (where “too many” may be set at the user’s discretion),
would be useful to have.
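
One simple form such a monitor could take is to count direction reversals in the history of a single weight and flag the run once the count reaches a user-set threshold. This is a sketch under assumptions: the function names and the `too_many` default are hypothetical, and the trace is only loosely modeled on the W3/W1 values quoted in Section 5.4.3 (4.0, 0.03, 0.01), with the intermediate values invented.

```python
def count_reversals(history):
    """Count how many times successive changes in a weight switch sign."""
    deltas = [b - a for a, b in zip(history, history[1:]) if b != a]
    return sum(1 for d0, d1 in zip(deltas, deltas[1:]) if d0 * d1 < 0)


def is_cycling(history, too_many=3):
    """Flag a weight trace that has reversed direction at least too_many times."""
    return count_reversals(history) >= too_many


# Illustrative trace: the weight swings down, up, down, up, down.
trace = [4.0, 0.03, 2.0, 0.01, 1.5, 0.01]
reversals = count_reversals(trace)
```

A trace that converges monotonically produces no reversals and is never flagged, so the threshold only affects runs that are actually swinging back and forth.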

INIT  should  also  be  invariant  to  the  order  in  which  constraints  are  processed.  In 
this  implementation,  the  order  in  which  the  hard  constraints  are  considered  impacts  the 
returned  weight  vector.  After  the  soft  constraints  have  been  aggregated  via  a  centroid 
computation,  INIT  cycles  through  the  hard  constraints  and  checks  to  see  if  the  currently 
suggested  weights  are  liable  to  meet  them.  If  they  are  not,  INIT  adjusts  the  weights  up  or 
down  as  needed.  If  competing  constraints  are  being  considered,  the  last  one  addressed  by 
INIT  will  be  favored,  rather  than  a  median  weight  which  might  satisfy  both.  Given  that 
we do not know the number or type of our hard constraints a priori, this problem might
be  better  addressed  within  a  cognitive  architecture,  where  its  pattern-matching  abilities 
would  be  very  useful. 
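
The order dependence can be illustrated with a deliberately simplified model in which each hard constraint implies an acceptable interval for a single weight. That model, the function names, and the midpoint rule used as a stand-in for “a median weight” are all assumptions made for illustration; they do not describe the INIT implementation.

```python
def sequential_adjust(weight, intervals):
    """Clamp weight into each acceptable interval in turn (order-dependent)."""
    for low, high in intervals:
        weight = min(max(weight, low), high)
    return weight


def order_invariant_adjust(weight, intervals):
    """Clamp into the common interval, or take the midpoint of the gap if none."""
    low = max(lo for lo, _ in intervals)
    high = min(hi for _, hi in intervals)
    if low <= high:                      # the intervals overlap
        return min(max(weight, low), high)
    return (low + high) / 2.0            # compromise between conflicting bounds


# Hypothetical acceptable ranges for one weight under two competing constraints.
a, b = (0.0, 1.0), (2.0, 5.0)
last_is_b = sequential_adjust(3.0, [a, b])   # ends inside b's range
last_is_a = sequential_adjust(3.0, [b, a])   # ends inside a's range
compromise = order_invariant_adjust(3.0, [a, b])
```

The sequential version lands in whichever interval is processed last, while the aggregate version returns the same value for either ordering. When the intervals do not overlap, the midpoint satisfies neither constraint exactly, so it is only a starting point for later weight adjustment.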

6.1.2  Improving  WADJ 

After  the  INIT  cycle,  the  “adverbial  modifiers”  like  “very”  or  “somewhat”  are  lost 
in  the  weight  adjustment  process.  The  endpoints  of  the  fuzzy  regions  for  the  soft 
constraints are fixed, without regard to the strength of the user’s preference. Nor are they
currently  considered  when  deciding  which  of  several  competing  soft  constraints  must  fail. 
The  assumption  has  been  that  the  INIT  process  would  put  the  solution  in  approximately 
the  correct  region  in  weight-space,  and  further  iterations  would  reflect  that.  That 
assumption  does  not  necessarily  hold,  as  the  WADJ  rules  can  cause  oscillations  of 
initially  very  large,  then  decreasing,  magnitude  in  weight  space.  Something  that 
preserves the knowledge of soft preference strength past the INIT phase would help the
solution adhere more closely to true user preference, and perhaps reduce total iterations needed.

A  more  sophisticated  notion  of  error  margins  in  FEXT  might  also  be  of  use  here. 

A  WADJ  algorithm  that  seeks  to  minimize  the  entire  vector  of  errors,  rather  than  each 
error  individually,  would  be  computationally  more  expensive  (an  optimization  within  an 
optimization)  but  could  yield  superior  results  with  fewer  iterations. 

Finally,  other  forms  of  WADJ  specific  to  other  cost  functionals  could  be  explored. 
A  cost  functional  based  on  a  linear  quadratic  regulator  (LQR),  in  which  components  of 
the  state  vector  like  the  velocities  are  directly  penalized,  could  replace  the  time 
component of the cost functionals used here. Of course, these new terms would still have
weighting  terms  and  the  relationships  between  them  would  have  to  be  investigated, 
following  the  procedures  outlined  here. 

6.1.3 Improving EVAL

We  would  like  to  augment  EVAL  with  an  understanding  of  the  adverbial 
modifiers,  as  mentioned  above,  so  that  preferences  the  user  described  as  weaker  would  be 
violated  in  favor  of  meeting  more  strongly-held  preferences.  Additionally,  some 
mechanism  whereby  the  original  set  of  limits  L°  can  be  revisited  and  perhaps  altered  by 
the architecture is an avenue of further research. There could be cases where the slight
easement  of  a  limit  could  lead  to  an  overall  acceptable  solution;  we  would  like  to  be  able 
to  identify  these  cases  and  flag  them  for  the  user.  In  this  vein,  the  addition  of  “firm” 
versus  “hard”  or  “soft”  constraints  might  be  considered:  those  constraints  which  the  user 
very  greatly  prefers  to  be  met,  but  which  do  not  indicate  total  failure  if  failed. 

6.1.4  Developing  the  Fuzzy  Rule  Database 

The  fuzzy  rules  used  in  this  research  were  created  in  an  ad  hoc  fashion.  In 
practice,  they  would  possibly  be  generated  in  a  more  principled  way.  For  a  robot  that 
was to interact with the general public, a human user survey could be conducted to learn
what  the  user  would  consider  a  typical  robot  response  to  commands  such  as  “come 
quickly”  or  “follow  carefully.”  These  expectations  would  then  be  incorporated  into  the 
fuzzy  rules  database. 

If  the  robot  was  to  be  a  personal  assistant,  then  either  some  programmable 
interface  or  else  an  automated  machine  learning  technique  could  be  implemented  to  allow 
the  robot  to  more  closely  conform  to  a  single  user’s  expectations. 

6.1.5  Application  to  Other  Adjustable  Parameters 

Many numerical solution techniques used today involve the adjustment, usually
by hand and by good judgment, of certain parameters. Examples include the
continuation schedule gains used for BVP4C and BVP4C2 solutions or the rates
controlling mutation and crossover in an evolutionary algorithm. Can the human
judgment gained by trial and error for these parameters be distilled into some sort of
adjustment curves, as the cost functional weights were distilled in the WADJ
calculations? Can we make plots of “algorithm convergence vs. parameter”? If it could be
done, this would serve to generalize this work beyond the optimal control community.

Some optimization routines use negative weights in the cost functional to allow
certain terms (e.g., a quality measure) to be maximized. Users must be very careful when
doing this, because the term can then grow without bound as time goes to infinity: the
cost goes to negative infinity, dominated by this term times its negative weight. If the
user has determined that, due to the properties of the particular problem, this will not
happen, then such a term can be used. This work does not investigate the possibility of
adding such terms; that is left for future work.
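
To make the unboundedness concern concrete, the following is a minimal sketch under assumed notation. The symbols w_1, w_2, L_1, q, L_max, and q_min are illustrative only and are not the cost functional used in this dissertation. Suppose a free-final-time cost places a positive weight on a running cost L_1 and a negative weight on a quality measure q:

```latex
J = \int_0^{t_f} \bigl( w_1 L_1(x,u) - w_2\, q(x,u) \bigr)\, dt, \qquad w_1, w_2 > 0 .
% If some admissible trajectory keeps L_1 \le L_{\max} and q \ge q_{\min} > 0,
% and the weights satisfy w_2 q_{\min} > w_1 L_{\max}, then along that trajectory
J \le \bigl( w_1 L_{\max} - w_2\, q_{\min} \bigr)\, t_f \;\longrightarrow\; -\infty
\quad \text{as } t_f \to \infty .
```

Under these assumptions no minimizer exists unless the final time or the quality term is bounded, which is the property the user would have to verify before using such a term.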

6.2  Final  Summary 

In  this  dissertation,  we  have  developed  an  architecture  which  brings  together 
several  tools  for  the  autonomous  generation  of  preference-optimized  trajectories.  Both 
hard  and  soft  constraints  are  handled,  with  tradeoffs  between  soft  constraints  being  made 
in  an  intelligent  fashion.  This  intelligence  comes  from  cost  functional  weight  adjustment 
guidelines  developed  from  domain-specific  data,  and  the  methods  for  collecting  and 
interpreting  this  data  that  have  been  developed  seem  to  be  generalizable  from  linear  to 
nonlinear  domains.  We  have  been  able  to  find  no  prior  work  that  addresses  this  problem 
of  weight  selection  and  adjustment  in  anything  other  than  an  ad  hoc  fashion,  much  less 
provide  an  overarching  framework  for  its  application  to  a  variety  of  domains.  As  we 
require  humans  to  interact  with  and  task  robots  to  perform  autonomous  missions,  whether 
on  land,  sea,  air,  or  in  space,  we  will  need  ways  for  the  robot  to  understand  the  human 
user’s  requirements  on  it.  This  work  is  a  step  in  that  direction. 
Appendix A: Features and Limits


Features 

The following trajectory features were used or considered for use in this research:

max-speed, the maximum forward speed of the vehicle over the trajectory
avg-speed, the average forward vehicle speed over the trajectory
max-acc, the maximum absolute value of the vehicle acceleration over the trajectory
avg-acc, the average absolute value of the vehicle acceleration over the trajectory
speed-plat, a measure of how long the vehicle maintained a constant speed (plateau);
a plateau is taken to be a region where speed does not vary by more than +/- 0.5%
(or 1% total) of its overall range (Condition 1) and which extends for at
least 10% of the time domain (Condition 2)
acc-plat, defined as speed-plat but for acceleration
min-sep, the minimum separation of the path from each obstacle in the environment

max-rot-vel, avg-rot-vel, and rot-plat-vel were not considered, although they could have
been defined in the same way as the terms above.
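
The two plateau conditions can be sketched in code. This is a minimal illustration, assuming a sampled speed profile and reporting the longest plateau as a fraction of the time domain; the function name, the brute-force window search, and the choice of return value are assumptions, not the implementation used in this research.

```python
def longest_plateau_fraction(times, values, band=0.01, min_fraction=0.10):
    """Longest plateau as a fraction of the time domain (0.0 if none qualifies).

    band: total allowed variation as a fraction of the signal's overall range
          (1% total, i.e. +/- 0.5%), Condition 1.
    min_fraction: minimum plateau length as a fraction of the time domain,
          Condition 2.
    """
    n = len(times)
    if n < 2 or times[-1] == times[0]:
        return 0.0
    span = times[-1] - times[0]
    tol = band * (max(values) - min(values))
    best = 0.0
    for i in range(n):
        lo = hi = values[i]
        j = i
        # Extend the window while its min-to-max spread stays within the band.
        while j + 1 < n:
            new_lo = min(lo, values[j + 1])
            new_hi = max(hi, values[j + 1])
            if new_hi - new_lo > tol:
                break
            lo, hi = new_lo, new_hi
            j += 1
        best = max(best, times[j] - times[i])
    fraction = best / span
    return fraction if fraction >= min_fraction else 0.0


# Hypothetical speed profile: ramp up, hold at 4 m/s for 3 s, ramp down.
times = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
speeds = [0, 1, 2, 3, 4, 4, 4, 4, 3, 2, 1]
speed_plat = longest_plateau_fraction(times, speeds)
```

The same function applied to an acceleration profile would give acc-plat under the same assumptions.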

Additionally, the cost functional terms were considered features. In the 2DOF
domain, these were:

energy, the electrical energy used to move the vehicle forward
time, the time taken for the trajectory to complete
obstacle_penalty, as described in Eq. 4.5

In the 6DOF domain, the cost functional feature terms were:

force, the force exerted by the thrusters
torque, expressed as an electrical minimum-energy term
time, the time taken for the trajectory to complete
obstacle_penalty

Limits 

Hard  or  soft  numeric  limits  could  be  placed  on  any  of  the  features.  In  practice,  we 
used only upper or lower hard limits and soft range limits. That is, for hard limits, we
required  that  the  feature  value  be  above  or  below  some  value.  For  soft  numeric  limits,  we 
required  that  the  feature  be  between  two  values.  However,  the  architecture  handles  soft 
upper  or  lower  limits  as  well  (although  the  centroid  calculation  in  INIT  would  have  to  be 
modified  for  best  results). 

The other type of limit used was the soft word constraint. We used words like
“quickly” or “safely” as optimization parameters. To translate these words into
something the architecture could use, we broke them down into sets of fuzzy feature
values on different state features. These definitions are applied across domains:
“quickly” means “high max-speed” and “high avg-speed” everywhere it is used.
However, the actual values in meters per second that constitute high speed in a given
domain may change. They are loaded from a domain-specific file in our implementation.

We  defined  the  following  words  for  this  research,  although  not  all  were  used  in 
the  results  presented  in  this  dissertation: 

• Boldly = {low min-sep, high speed-plat, high acc-plat, medium-high avg-speed}

• Cautiously = {low min-sep, low max-speed, low avg-speed, low max-acc, low speed-plat, low acc-plat}

• Efficiently = {low force} (for 6DOF) or {low energy} (for 2DOF)

• Energy-saving = {low force, low torque} (for 6DOF only)

• Inquisitively = {low min-sep, medium-low avg-speed}

• Quickly = {high max-speed, high avg-speed}

• Safely = {medium-high min-sep, low max-speed, low avg-speed, low max-acc}

• Stealthily = {low min-sep, medium-low avg-speed}

Although WADJ heuristic development results seemed to show that speed-plat and
acc-plat could be predictably controlled, that proved not to be the case in the presence
of obstacles. In very open or uncluttered terrain, one might expect to be able to command
them, but not otherwise.
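
The split between domain-independent word definitions and domain-specific numeric values can be sketched as a two-level lookup. This is a hedged illustration: the names `WORDS`, `RANGES`, and `expand` are hypothetical, the format of the actual domain-specific file is not given here, and the numeric ranges are the ones shown in the Appendix B margin tables.

```python
# Word definitions (a subset of the list above); these do not change by domain.
WORDS = {
    "quickly": {"max-speed": "high", "avg-speed": "high"},
    "inquisitively": {"min-sep": "low", "avg-speed": "medium-low"},
}

# Domain-specific (lower, upper) ranges for each fuzzy level, as they appear
# in the Appendix B margin tables. A real system would load these from a file.
RANGES = {
    "2DOF": {
        ("max-speed", "high"): (4.0, 8.0),
        ("avg-speed", "high"): (2.67, 5.33),
        ("avg-speed", "medium-low"): (0.67, 1.33),
        ("min-sep", "low"): (0.3, 1.0),
    },
    "6DOF": {
        ("max-speed", "high"): (5.0, 10.6),
        ("avg-speed", "high"): (4.0, 8.6),
    },
}


def expand(word, domain):
    """Expand a word constraint into soft range limits for one domain."""
    levels = WORDS[word]
    table = RANGES[domain]
    return {feature: table[(feature, level)] for feature, level in levels.items()}


quickly_2dof = expand("quickly", "2DOF")
quickly_6dof = expand("quickly", "6DOF")
```

The same word yields different numeric limits in each domain, which is the behavior described above for “quickly.”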

We additionally defined a large number of verbs and adverbs, correlating them to
upper, lower, or range limits on a host of feature values, including:

speed smoothness, a hypothetical measure of the rate of change in speed over the
trajectory; perhaps the linearity of the acceleration
acceleration smoothness, a hypothetical measure of the rate of change of the
acceleration over the trajectory
# sign changes of acceleration, the number of times the vehicle changes from
accelerating to decelerating or vice versa
length, path length; perhaps normalized with respect to the straight-line distance
between start and goal
speed near obstacle
# sign changes in rotational velocity
rotational velocity smoothness
max-rot-acc, avg-rot-acc, rot-acc-plat, as other max, avg, and plat values, for
rotational acceleration
# sign changes in rotational acceleration
rotational acceleration smoothness
target acqu, the precision with which a target is acquired
under cover, the percent of the trajectory the vehicle remains in terrain known to be
cover

The  definitions  are  given  below.  The  state  features  involved  with  each  word  limit  are 
marked  with  an  x. 
[Table: word definitions. The column positions of the x marks did not survive
extraction; only the row and column labels are listed.]

Columns (state features): max speed, avg speed, speed plat, speed smoothness, max acc,
min acc, avg acc, acc plat, # sgn chg acc, acc smoothness, length, min sep, speed near
obst, max rotvel, avg rotvel, rotvel plat, # sgn chg rotvel, rotvel smoothness, max
rotacc, min rotacc, rotacc plat, # sgn chg rotacc, rotacc smoothness, target acqu,
under cover

Rows (words): accurately; precisely; agilely, nimbly; quickly, fast; briskly; boldly;
conspicuously; stately; directly, straight; efficiently; curiously, all; curiously,
environment; curiously, objects; carefully; cautiously; slowly; ploddingly;
meanderingly; leisurely; normally; gently; roughly; gracefully; simply, just; softly,
quietly; stealthily; loudly; easily; uneasily, hesitantly; shyly; charge; drift;
shamble, shuffle; mingle; walk; jog
Appendix B: Test Cases and Margin Data


2DOF  Cases 

Start  Point,  Goal  Point,  and  Obstacle  Sets 

The  state  x  is  a  4-vector  [position,  velocity]  in  the  x-y  plane.  In  all  cases: 
x0 = [-5 -5 0 0]
xf = [5 5 0 0]

Circular  obstacles  are  denoted  as  a  set  containing  their  center  in  the  plane  (x,  y)  and  their 
radius  r,  all  in  meters.  The  set  of  obstacles  is  denoted  {O}. 

Obstacle Set 1: {O} = { {(0.5, -0.5), 1}, {(-4.0, -2.5), 0.5}, {(4.0, 3.0), 0.5} }

Obstacle Set 2: {O} = { {(1.5, 2.5), 2.0}, {(-4.2, -3.5), 0.5}, {(-3.2, -4.0), 0.5} }

Obstacle Set 3: {O} = { {(1, 1.2), 3} }

Obstacle Set 4: {O} = { {(0.0, -3), 0.5}, {(4.0, -3.5), 0.5}, {(-3.5, 0.5), 1.0},
{(-1.0, 0.5), 0.5}, {(1.0, 3.7), 0.5}, {(4.0, 3.0), 0.5},
{(0.5, -0.5), 1.0} }

Constraint  Sets 

Constraint Set 1

H° = {max-speed < 4.2}

S° = {somewhat quickly}

Constraint Set 2

H° = {max-acc < 1.0, min-sep > 1.7}

S° = {safely}

Constraint Set 3

S° = {a little quickly, exceedingly inquisitively}

Constraint Set 4

H° = {max-acc < 1.0, max-speed < 4.0}

S° = {10 <= energy < 15, 1.0 < avg-speed < 2.0}

Constraint Set 5

H° = {max-acc < 1.0, max-speed < 4.0}

S° = {10 < energy < 15, 1.0 < avg-speed < 2.0, moderately safely}

Margins  of  Success  and  Failure 

Green cells represent margins of success. Margins are normalized as described in
Chapter 4. For hard constraints (all uppers in these runs), a positive sign indicates how
far under the limit the value is. For soft constraints, a positive sign indicates that the
margin normalized to the high side of the center of the fuzzy range and a negative sign
indicates that it normalized to the low side.

Pink cells represent margins of failure. For hard failures, signs are positive and indicate
how far over the limit the failure went. For soft range failures, a positive sign indicates
that the upper limit was exceeded (“failed high”) while a negative sign indicates that the
value fell below the lower limit (“failed low”).
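
As a hedged illustration of these sign conventions, the sketch below computes a hard upper-limit margin and a soft range failure margin. The normalizations are assumptions inferred from this appendix, not the Chapter 4 definitions: dividing the hard margin by the limit reproduces the stated correspondence between a margin of 0.08 and a max-speed of 3.864 m/s against a 4.2 m/s limit, and dividing the soft failure distance by half the range width reproduces the tabulated -0.07 for that same run against the 4 to 8 m/s range. The fail-high branch is assumed by symmetry. Success margins for soft constraints are not modeled.

```python
def hard_upper_margin(value, limit):
    """Return (met, margin) for a hard upper limit.

    The margin is a positive magnitude in both cases: how far under the limit
    on success, how far over on failure. Normalizing by the limit is an
    assumption that matches the 0.08 / 3.864 m/s example in this appendix.
    """
    met = value <= limit
    margin = abs(limit - value) / limit
    return met, margin


def soft_range_failure_margin(value, lower, upper):
    """Signed failure margin for a soft range, or None if the value is in range.

    Negative means "failed low", positive means "failed high". Normalizing by
    half the range width is an assumption inferred from the tabulated values.
    """
    half_width = (upper - lower) / 2.0
    if value < lower:
        return (value - lower) / half_width
    if value > upper:
        return (value - upper) / half_width
    return None


met, hard_margin = hard_upper_margin(3.864, 4.2)
soft_margin = soft_range_failure_margin(3.864, 4.0, 8.0)
```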
Margins                           |           Obstacle Set 1            |           Obstacle Set 2            |           Obstacle Set 3            |           Obstacle Set 4
                                  |    INIT0    INITn Default0 Defaultn |    INIT0    INITn Default0 Defaultn |    INIT0    INITn Default0 Defaultn |    INIT0    INITn Default0 Defaultn

Constraint Set 1
max speed <= 4.2                  |     0.37     0.08     0.67     0.08 |     0.10     0.60     0.08     0.75 |     0.76     0.27     0.87     0.13 |     0.57     0.04     0.29     0.23
somewhat quickly
  hi 4 <= max speed <= 8          |    -0.68    -0.07    -1.32    -0.06 |    -0.69    -1.15    -0.06    -1.49 |    -1.50    -0.47    -1.74    -0.18 |    -1.41    -0.98    -0.50    -0.39
  hi 2.67 <= avg speed <= 5.33    |    -0.44    -0.81    -1.14    -0.81 |    -0.62    -0.95    -0.93    -1.39 |    -1.36    -0.20    -1.62    -0.77 |    -0.80    -0.50    -0.53    -0.04

Constraint Set 2
max acc <= 1                      |     0.25     0.25     0.41     0.25 |     0.78     0.04     3.78     0.29 |     0.28     0.28     0.47     0.39 |     0.08     0.08     0.50     0.44
min sep >= 1.7                    |     0.02     0.02     0.51     0.02 |     0.00     0.00     0.43     0.00 |     0.48     0.48     0.44     0.48 |     0.02     0.02     0.95     0.02
safely
  hi 3 <= min sep <= 5            |    -0.26    -0.26    -1.16     0.26 |    -0.80    -0.80    -1.53    -0.80 |    -0.48    -0.48    -1.04    -0.48 |    -0.26    -0.26    -1.91
  lo 0.5 <= max speed <= 1.5      |    -0.72    -0.72     0.74    -0.50 |    -0.56    -0.92     4.76    -0.88 |    -0.56    -0.56    -0.94    -0.86 |    -0.42    -0.42     3.00    -0.69
  lo 0.33 <= avg speed <= 1       |    -0.36    -0.36     0.45    -0.18 |    -0.18    -0.57     5.25    -0.48 |     0.00     0.00    -0.45    -0.96 |     0.36     0.36     2.84    -0.33
  lo 0.33 <= max acc <= 1         |     0.27     0.27    -1.22     0.27 |     2.33     0.87    11.25    -0.12 |     0.15     0.15    -0.42    -0.15 |     0.75     0.75     1.49    -0.32

Constraint Set 3
a little quickly
  hi 4 <= max speed <= 8          |    -1.29    -0.74    -1.32    -0.06 |    -1.63    -1.48    -0.06    -0.89 |    -1.34    -1.28    -1.74    -0.31 |    -1.02     0.14    -0.50    -1.05
  hi 2.67 <= avg speed <= 5.33    |    -1.09    -0.44    -1.14    -0.81 |    -1.58    -1.30    -0.93    -0.83 |    -1.09    -1.20    -1.62    -0.97 |    -1.18     0.14    -0.53    -0.90
exceedingly inquisitively
  medlo 0.67 <= avg speed <= 1.33 |     0.67     6.36     0.45     4.82 |    -0.30    -0.61     4.33     4.70 |     0.67    -0.64    -0.45     4.18 |     0.30     4.58     1.88     0.39
  lo 0.3 <= min sep <= 1          |    -0.60     0.77     0.54     0.49 |    -0.66    -0.20    -0.51    -0.20 |    -0.54     0.20     0.89     0.29 |    -0.46    -0.60    -0.60     2.11

Constraint Set 4
max acc <= 1                      |     0.41     0.04     0.41     0.04 |     3.78     0.46     3.78     0.46 |     0.47     0.04     0.47     0.04 |     0.50     0.05     0.50     0.05
max speed <= 4                    |     0.66     0.76     0.66     0.76 |     0.03     0.87     0.03     0.87 |     0.87     0.76     0.87     0.76 |     0.25     0.46     0.25     0.46
soft 10 <= energy <= 15           |     0.23     0.10     0.23     0.10 |     1.62     0.37     1.62     0.37 |     0.84     0.84     0.84    -0.05 |     0.14     0.98     0.14     0.98
soft 1 <= avg speed <= 2          |    -0.70    -0.38    -0.70    -0.38 |     1.52    -1.06     1.52    -1.06 |    -0.96    -0.22    -0.96     4.18 |     0.90    -0.28     0.90    -0.28

Constraint Set 5
max acc <= 1                      |     0.38     0.06     0.41     0.04 |     2.76     0.01     3.78     0.46 |     0.06     0.07     0.47     0.03 |     0.10     0.05     0.50     0.02
max speed <= 4                    |     0.74     0.82     0.66     0.76 |     0.74     0.85     0.03     0.87 |     0.74     0.77     0.87     0.76 |     0.82     0.81     0.25     0.76
soft 10 <= energy <= 15           |     2.46     2.23     0.23     1.02 |     2.74     1.62     1.62     0.37 |     2.08     2.20     0.84     0.84 |     4.56     4.13     0.14     0.89
soft 1 <= avg speed <= 2          |     0.32    -0.58    -0.70    -0.38 |    -0.38    -1.04     1.52    -1.06 |    -0.12    -0.44    -0.96    -0.44 |    -1.10    -0.98     0.90    -0.38
moderately safely
  hi 3 <= min sep <= 5            |    -0.26    -0.26    -1.16    -0.35 |    -0.80    -0.80    -1.53    -1.28 |    -0.48    -0.48    -1.03    -1.04 |    -0.26    -0.26    -1.91    -2.28
  lo 0.5 <= max speed <= 1.5      |     0.12    -0.54     0.74    -0.08 |     0.08    -0.80     4.76    -0.92 |     0.12    -0.16    -0.94    -0.08 |    -0.58    -0.46     3.00    -0.06
  lo 0.33 <= avg speed <= 1       |     0.51     0.12     0.45     0.42 |     0.51    -0.57     5.25    -0.57 |     0.18     0.36    -0.45     0.36 |    -0.66    -0.45     2.84     0.45
  lo 0.33 <= max acc <= 1         |     1.13     0.84     1.22     0.90 |     8.24     0.99    11.25    -0.36 |     0.81     0.78    -0.42     0.93 |     0.69     0.87     1.49     0.93

2DOF  Trajectory  Data  by  Constraint  Set 

We append here charts showing the paths and trajectories of all 40 2DOF test
cases. Each chart displays the Defaultn case in green and the INITn case in blue. Each
constraint set and obstacle set solution is represented by two charts. The first shows the
path that the agent took through the obstacle field. The second shows trajectory
information over time: the x-coordinate (X), y-coordinate (Y), x-velocity (Xdot),
y-velocity (Ydot), x-input (U) and y-input (V).

Each  chart  is  labeled  using  “CS”  to  denote  “Constraint  Set”  and  “OS”  to  denote 
“Obstacle  Set.”  So  “CS1,  OS1”  is  the  information  for  Constraint  Set  1,  Obstacle  Set  1. 
Constraint Set 1 included H° = {max-speed <= 4.2}, S° = {somewhat quickly}.
The soft constraints were expanded to S° = {4 m/s < max-speed < 8 m/s, 2.67 m/s <
avg-speed < 5.33 m/s}. This constraint set was difficult because of the small window on
max-speed that would satisfy both H° and S°. The H° was mostly made by very small
margins; for instance, Constraint Set 1, Obstacle Set 1 had an H° margin of 0.08, which
means that the actual max-speed was 3.864 m/s. This was still lower than the lower limit
on the max-speed soft constraint, so that returned a soft failure. An examination of the
H° margins in the solution histories shows a see-saw pattern with the H° margins
shrinking overall. At first, the H° was met but the S° were not; they failed low. The time
weight was raised to try to increase the max-speed and avg-speed to meet the S°. This
typically failed the H°. The time weight was lowered again, but not as low as it had been
initially. This traded off some of the margin on H° to get closer to S°. Let S°1 be the soft
constraint on max-speed and S°2 be the soft constraint on avg-speed. Then the following
results are for the INITn runs, showing the H° margins for each iteration and the final
success/failure conditions for S°:

CS1, OS1: {0.369195, 0.077382} (failed S°1, succeeded S°2)

CS1, OS2: {0.102730 (failed), 0.702101, 0.615724 (failed), 0.734784, 0.596354}
(failed S°1, S°2)

CS1, OS3: {0.762050, 0.460474 (failed), 0.269521} (failed S°1, S°2)

CS1, OS4: {0.566147, 0.284714 (failed), 0.034656} (succeeded S°1, S°2)
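
The see-saw pattern in these histories resembles a step-halving search for a weight that lands max-speed in the narrow window between the soft lower limit and the hard upper limit. The sketch below illustrates that pattern only. It is not the WADJ heuristic: the linear `toy_max_speed` model, the starting weight, and the halving rule are all hypothetical stand-ins for the trajectory planner and the adjustment curves.

```python
def toy_max_speed(time_weight):
    """Hypothetical monotone stand-in for the planner: more time weight, more speed."""
    return 3.0 + 1.0 * time_weight


def seesaw_search(weight, step, hard_max, soft_lower, model, max_iter=20):
    """Raise the weight on a soft fail-low, lower it on a hard failure.

    The step is halved each time the direction reverses, so the weight is
    lowered "but not as low as it had been initially".
    Returns (final_weight or None, history of (weight, max_speed)).
    """
    history = []
    direction = 0
    for _ in range(max_iter):
        speed = model(weight)
        history.append((weight, speed))
        if speed > hard_max:
            new_direction = -1   # hard limit failed: back off
        elif speed < soft_lower:
            new_direction = +1   # soft constraint failed low: push harder
        else:
            return weight, history
        if direction != 0 and new_direction != direction:
            step /= 2.0
        direction = new_direction
        weight += new_direction * step
    return None, history


# Window from Constraint Set 1: hard max-speed <= 4.2, soft max-speed >= 4.
final_weight, history = seesaw_search(0.1, 2.0, 4.2, 4.0, toy_max_speed)
```

With these toy numbers the search passes through a soft fail-low, then a hard failure, then a success, which mirrors the three-entry histories reported for OS3 and OS4.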

Obstacle Set 2 apparently presented some difficulty; the solution required more iterations
from both Defaultn and INITn, and it was longer (in traversal time) than the other
trajectories. For the other obstacle sets, the trajectory was traversed in ~5 - 7 seconds.
For Obstacle Set 2, it took INITn's solution ~12 seconds and Defaultn's just over 20
seconds. These runs did not show more computational problems than the others. This
seems to indicate that the trajectory planner found a local minimum that represented a
different solution mode than the other three.
Constraint Set 2 had H° = {max-acc < 1.0, min-sep > 1.7}, S° = {safely}.
“Safely” expanded into soft range constraints on min-sep, max-speed, avg-speed and
max-acc. Due to a bug, EVAL did not record failed soft min-sep limits as failures. (Hard
min-sep failures and other soft failures were all detected; the line of code to detect this
particular kind of failure was simply missing, which was not discovered until data
post-processing.) While these failures were counted in our analysis, it does mean that the
routine exited early in some cases where, had it continued, it might have found a
solution. That is, however, doubtful; the soft min-sep range was on the order of a quarter
to half of the field. In all but the single-obstacle case, it would be very impractical to
get sufficiently far away from the obstacles to not trigger that soft failure. This is a
case where perhaps our ad hoc definitions were a bit off; the “high min-sep” required by
“safely” might have been set to smaller actual values.

INITn was particularly successful with this constraint set; three out of four runs
ended after one iteration (all with the soft min-sep failure which went undetected by
EVAL). They all took about 30 seconds (± about 5 seconds) to traverse their respective
obstacle fields. Obstacle Set 2, which had a W1/W2 weight ratio almost 4 times that of the
others (17.84 versus 4.67), took longer. Obstacle Set 2 had a very hard time
meeting the max-acc hard limit, and kept raising W1/W2 until it was met. Defaultn did not
have this problem, ironically, because it started with a lower LIM value. The INIT
procedure set LIM to 3.0 for the initialized runs because of the “safely” constraint. This
met the hard min-sep limit of 1.7 m with room to spare. The default value for LIM was
1.0, which had to be raised in all cases; but in the Obstacle Set 2 case, it only had to be
raised to 1.7 to meet the hard min-sep limit exactly. This lower LIM meant a shorter,
straighter path that could be traversed with lower accelerations for a much higher W1/W2
value (5.1 for Defaultn in the OS2 case). Defaultn had a similar problem with OS3. It
started off failing both min-sep and max-acc hard limits, and raised both W1/W2 and LIM
until max-acc was met. Then it raised LIM alone until min-sep was met. But W1/W2 was so
high that the lower limits on the “low avg-speed” and “low max-speed” elements of “safely”
were missed; W1/W2 was lowered (which is to say, W2, the time weight, was increased) so
that the hard and soft limits were all met.
Constraint Set 3 was self-sabotaging. It asked for trajectories with S° = {a little
quickly, exceedingly inquisitively}. These expanded into high avg-speed and max-speed
and medium-low avg-speed and low min-sep, respectively. It would not be possible to
satisfy both high avg-speed and medium-low avg-speed. The idea, of course, was to
create a trajectory that was mostly inquisitive but on the fast end of that. For OS 2 and 3,
INIT seemed to do just that, finding avg-speeds that were within the “inquisitively”
requirements (although not always on the higher side). OS 1 and 4 seemed to converge
to the “quickly” requirements instead. The Default trajectories more often made the
“quickly” constraints over the “inquisitively” constraints, possibly a result of constraint
ordering. This is why they are so much faster than the ones arising from INIT for OS 1-3.
OS 4 gave Default problems as well, as it did not make any of the constraints before
timing out of the search procedure. (All of these runs, for INIT and for Default, exited
either with a loop in the weights or with the time limit expired.)
Constraint Set 4 had H° = {max-speed < 4.0 m/s, max-acc < 1 m/s^2}, S° = {10 J <
energy < 15 J, 1 m/s < avg-speed < 2 m/s}. That low acceleration required a high W1/W2
weight ratio, which meant that it was difficult to satisfy the avg-speed soft constraint.
It was also difficult to meet the energy constraint, even with very high W1/W2 values
(e.g., 22; that is, energy is weighted 22 times more heavily than time). This implied that
something may have been off with the weight adjustment heuristic for energy.
Interestingly, in the OS 3 cases, the W1/W2 ratio was adjusted down, not up, because the
avg-speed was failing low. Because of constraint ordering, it was processed after the
energy constraint failure and the net adjustment favored it.

The  resulting  trajectories  are  identical  in  these  cases  because  the  constraints,  when 
processed  by  INIT,  coincidentally  happen  to  return  the  same  value  as  the  Default  weight 
vector. 
Constraint Set 5 was another difficult set with competing soft constraints: H° =
{max-acc < 1.0, max-speed < 4.0}, S° = {10 < energy < 15, 1.0 < avg-speed < 2.0,
moderately safely}. So it was the same as Constraint Set 4 and had the same difficulties,
plus the soft constraint of “safely” added. (That “safely” caused INIT to increase the
initial LIM value from 1.0 to 3.0, so the trajectory planner did not start from identical
weight vectors as it did for Constraint Set 4.) “Safely” requires low speeds and low
acceleration, all of which were under the hard constraint values, so there was no
competition there: if the trajectory could make those S°, it would also make the H° for
free. However, the lower limit of the soft avg-speed constraint that was stated explicitly
was equal to the upper limit of the avg-speed range arising from “safely”; both equal 1
m/s. Unless the trajectory’s average speed could be made exactly 1.0 m/s (highly
unlikely), one of those constraints would have to fail. This is another constraint set
where every run exhausted the time limit on iterations trying to find acceptable solutions
to the soft constraints.
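
Conflicts of this kind can be detected before planning by intersecting the soft ranges placed on the same feature. The sketch below is a minimal illustration; the `intersect` helper is hypothetical, the ranges are treated as closed intervals, and the numeric values are those given for the 2DOF domain in this appendix.

```python
def intersect(a, b):
    """Intersection of two closed ranges (lower, upper), or None if empty."""
    lower = max(a[0], b[0])
    upper = min(a[1], b[1])
    return (lower, upper) if lower <= upper else None


# Constraint Set 5: explicit avg-speed range versus the low avg-speed from "safely".
explicit_avg_speed = (1.0, 2.0)
safely_avg_speed = (0.33, 1.0)
cs5_overlap = intersect(explicit_avg_speed, safely_avg_speed)

# Constraint Set 3: high avg-speed ("quickly") versus medium-low ("inquisitively").
quickly_avg_speed = (2.67, 5.33)
inquisitively_avg_speed = (0.67, 1.33)
cs3_overlap = intersect(quickly_avg_speed, inquisitively_avg_speed)
```

Constraint Set 5 leaves only the single point 1.0 m/s, and Constraint Set 3 leaves no overlap at all, which matches the behavior reported for both sets. A check like this could be one way to flag such cases for the user.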

The primary difference in the trajectories arose from the min-sep requirement.
When INIT was run, LIM was set to 3.0 to meet it, resulting in the longer trajectories seen
for the INIT cases. Default, starting from LIM = 1.0, found more trajectories that stayed
closer to obstacles.

Notable is the velocity spike in the middle of the CS 5, OS 4 Defaultn trajectory.
To minimize the penalty over time for being so close to the obstacles it was passing
between at that point, the vehicle sped up. This is not necessarily desirable behavior!
For this reason, the cost functional used in Chapter 5 penalized velocity near obstacles as
well as distance from them.
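
A small calculation shows why a time-integrated proximity penalty rewards speeding through a gap. This sketch assumes a constant penalty rate inside the gap and a constant crossing speed, which is a simplification. The speed-weighted variant is one hypothetical way to remove the incentive and is not the Chapter 5 cost functional.

```python
def accumulated_penalty(gap_length, speed, penalty_rate, speed_weighted=False):
    """Penalty accumulated while crossing a gap at constant speed.

    With a plain time integral, the total is penalty_rate * (gap_length / speed),
    so going faster lowers the cost. If the rate is multiplied by speed, the
    total becomes penalty_rate * gap_length, independent of speed.
    """
    duration = gap_length / speed
    rate = penalty_rate * speed if speed_weighted else penalty_rate
    return rate * duration


slow = accumulated_penalty(2.0, 1.0, 3.0)
fast = accumulated_penalty(2.0, 4.0, 3.0)
fast_weighted = accumulated_penalty(2.0, 4.0, 3.0, speed_weighted=True)
```

In this toy setting, quadrupling the speed cuts the plain penalty to a quarter, while the speed-weighted penalty is unchanged.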
6DOF Cases


Start  Point,  Goal  Point,  and  Obstacle  Sets 

The  state  x  is  a  12-vector  [position,  velocity,  rotation,  rotational  velocity].  The  rotation 
vector  is  the  modified  Rodrigues  vector.  Spherical  obstacles  are  denoted  as  a  set 
containing  their  center  in  space  (x,  y,  z)  and  their  radius  r,  all  in  meters.  The  set  of 
obstacles  is  denoted  {O} 
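
For readers unfamiliar with the rotation representation, the sketch below unpacks a 12-vector in the order stated above and builds a modified Rodrigues vector from an axis and angle using the standard relation sigma = e * tan(theta / 4). The function names and the dictionary layout are illustrative assumptions, not part of the implementation described in this dissertation.

```python
import math


def mrp_from_axis_angle(axis, angle):
    """Modified Rodrigues vector for a rotation of `angle` radians about `axis`."""
    norm = math.sqrt(sum(a * a for a in axis))
    scale = math.tan(angle / 4.0) / norm
    return tuple(a * scale for a in axis)


def unpack_state(x):
    """Split a 12-vector into [position, velocity, rotation, rotational velocity]."""
    if len(x) != 12:
        raise ValueError("expected a 12-element state vector")
    return {
        "position": tuple(x[0:3]),
        "velocity": tuple(x[3:6]),
        "rotation": tuple(x[6:9]),  # modified Rodrigues vector
        "rotational_velocity": tuple(x[9:12]),
    }


# Goal state of Obstacle Set 1: at rest with zero rotation.
goal = unpack_state([10.0, 10.0, 10.5] + [0.0] * 9)
# A 90-degree rotation about the z axis, for illustration only.
sigma = mrp_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
```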

Obstacle Set 1: One large obstacle in the way.
x0 = [0 0 0 0 0 0 0 0 0 0 0 0]
xf = [10.0 10.0 10.5 0 0 0 0 0 0 0 0 0]
{O} = { {(6, 6, 0), 1} }

Obstacle Set 2:
x0 = [0 0 0 0 0 0 0 0 0 0 0 0]
xf = [16 16 16 0 0 0 0 0 0 0 0 0]
{O} = { {(3, 2, 3), 0.5}, {(8, 5, 8), 0.5}, {(14, 12, 12), 0.5} }

Obstacle Set 3: Moving from one end of a line of satellites to the other.
x0 = [20 19 35 0 0 0 0 0 0 0 0 0]
xf = [1 1 2 0 0 0 0 0 0 0 0 0]
{O} = { {(6, 6, 12), 0.5}, {(11, 11, 22), 0.5}, {(16, 16, 32), 0.5} }

Obstacle Set 4: Assuming a position in a tetrahedron.
x0 = [20.4 25 12 0 0 0 0 0 0 0 0 0]
xf = [25 25 31.53 0 0 0 0 0 0 0 0 0]
{O} = { {(20.38, 25, 25), 1}, {(27.31, 21, 25), 1}, {(27.31, 29, 25), 1} }
Constraint Sets

Constraint Set 1

H° = {max-speed <= 5.5 m/s}

S° = {somewhat quickly}

Constraint Set 2

S° = {exceedingly efficiently}

Constraint Set 3

S° = {a little quickly}

Constraint Set 4

H° = {max-acc <= 1.3, max-speed <= 4}

S° = {2.7 <= force <= 5.4, 1.8 <= avg-speed <= 4}

Constraint Set 5

H° = {max-acc <= 1.3, max-speed <= 4}

S° = {2.7 <= force <= 5.4, 1.8 <= avg-speed <= 4, moderately safely}

Constraint Set 6

S° = {very energy-saving}

Constraint Set 7

S° = {low torque}

Constraint Set 8

S° = {medium torque}

Margins  of  Success  and  Failure 

This  table  is  formatted  the  same  way  as  the  one  in  the  2DOF  section. 
Margins                           |                Obs1                 |                Obs2                 |                Obs3                 |                Obs4
                                  |    INIT0    INITn Default0 Defaultn |    INIT0    INITn Default0 Defaultn |    INIT0    INITn Default0 Defaultn |    INIT0    INITn Default0 Defaultn

Constraint Set 1
max speed <= 5.5                  |     0.17     0.07     0.57     0.07 |     0.06     0.00     0.46     0.09 |     0.30     0.00     0.28     0.02 |     0.05     0.00     0.44     0.04
somewhat quickly
  hi 5 <= max speed <= 10.6       |    -0.15    -0.96    -0.95    -0.86 |    -0.70    -0.83    -0.72    -0.99 |    -0.23    -0.82    -0.37    -0.85 |    -0.73    -0.82    -0.68    -0.90
  hi 4 <= avg speed <= 8.6        |    -0.04    -0.89    -0.91    -0.89 |    -0.48    -0.64    -0.62    -0.88 |     0.06    -0.61    -1.23    -0.60 |    -0.65    -0.76    -1.68    -0.87

Constraint Set 2
exceedingly efficiently
  lo 2.7 <= force <= 5.4          |     0.36     0.36     2.00     0.05 |     0.16     0.99     3.63     0.99 |     0.98     0.00     4.40     0.00 |    -0.17    -0.17     0.64     0.02

Constraint Set 3
a little quickly
  hi 5 <= max speed <= 10.6       |    -0.15    -0.96    -0.95    -0.96 |    -0.70    -0.70    -0.72    -0.99 |    -0.23    -0.23    -0.37     0.58 |    -0.73    -0.73    -0.68    -0.68
  hi 4 <= avg speed <= 8.6        |    -0.04     0.89    -0.91    -0.89 |    -0.48    -0.48    -0.62    -0.88 |     0.06     0.06    -1.23     0.90 |    -0.65    -0.65    -1.68    -1.68

Constraint Set 4
max acc <= 1.3                    |    19.31     0.08    17.82     0.51 |    30.56     0.47    26.98     0.45 |    42.64     0.23    49.79     0.35 |    22.74    60.72    26.55    60.72
max speed <= 4                    |     0.14     0.90     0.42     0.95 |     0.46     0.94     0.26     0.95 |     0.79     0.92     0.01     0.93 |     0.44     6.24     0.23     6.24
soft 2.7 <= force <= 5.4          |     7.73    -0.85     2.00    -1.39 |    10.96    -1.34     3.63    -1.45 |    13.81    -1.08     4.40    -1.22 |     6.39    47.53     0.64    47.53
soft 1.8 <= avg speed <= 4        |     0.91    -1.45    -0.91    -1.54 |     1.08    -1.53    -0.30    -1.55 |     2.22    -1.48    -0.56    -1.51 |     0.74    18.16    -1.50    18.16

Constraint Set 5
max acc <= 1.3                    |    19.31     0.08    17.82     0.41 |    34.02     0.05    26.98     0.09 |    43.27     0.55    49.79     0.46 |    22.74     0.50    26.55     0.45
max speed <= 4                    |     0.14     0.90     0.42     0.94 |     0.50     0.92     0.26     0.92 |     0.83     0.95     0.01     0.94 |     0.44     0.95     0.23     0.94
soft 2.7 <= force <= 5.4          |     7.73    -0.85     2.00    -1.26 |    11.75    -1.13     3.63    -1.12 |    14.69    -1.51     4.40    -1.40 |     6.39    -1.56     0.64    -1.52
soft 1.8 <= avg speed <= 4        |     0.91    -1.45    -0.91    -1.52 |     0.99    -1.47    -0.30    -1.49 |     2.43    -1.56    -0.56    -1.54 |     0.74    -1.54    -1.50    -1.53
moderately safely
  hi 1.5 <= min sep <= 2.5        |    -0.80     0.34    -0.80     0.34 |     0.43     0.77    -1.20     1.02 |     0.66     0.39    -1.18     0.87 |     0.95     1.45     0.41     1.45
  lo 1.4 <= max speed <= 2.7      |     2.88    -1.54     0.44    -1.76 |     5.11    -1.65     0.42    -1.68 |     7.09    -1.87     1.93    -1.80 |     4.70    -1.82     0.61    -1.79
  lo 1 <= max acc <= 2.5          |    31.87    -0.73    29.30    -0.30 |    57.37    -0.68    45.17    -0.76 |    73.40    -0.55    84.70    -0.40 |    37.82    -0.46    44.42    -0.38
  lo 0.9 <= avg speed <= 1.8      |     4.67    -1.56     0.22    -1.71 |     7.31    -1.60     1.71    -1.64 |    10.82    -1.80    -0.37    -1.77 |     6.69    -1.76    -1.67    -1.74

Constraint Set 6
very energy-saving
  lo 0.05 <= torque <= 0.2        |    -0.67    -0.67    -0.67    -0.67 |    -0.67    -0.67    -0.66    -0.67 |    -0.62    -0.64    -0.20     0.90 |    -0.64    -0.64    -0.63    -0.43
  force                           |     0.32     0.32     2.00     0.05 |     0.16     0.02     3.63     0.99 |     0.96     0.87     4.40     0.00 |    -0.19    -0.19     0.64     0.02

6DOF  Trajectory  Data  by  Constraint  Set 

We append here charts showing the paths and trajectories of all 40 2DOF test
cases. Each chart displays the Defaultn case in green and the INITn case in blue. Each
constraint set and obstacle set solution is represented by three charts. The first shows the
path that the agent took through the obstacle field. The second shows translational and
rotational rates. The third shows force and torque inputs.

Each  chart  is  labeled  using  “CS”  to  denote  “Constraint  Set”  and  “OS”  to  denote 
“Obstacle  Set.”  So  “CS1,  OS1”  is  the  information  for  Constraint  Set  1,  Obstacle  Set  1. 
Path for CS 1, OS 2

Path for CS 1, OS 3

Path for CS 1, OS 4

Velocities and Rotations for CS 1, OS 4 (legend: INITn, Defaultn)

Force and Torque for CS 1, OS 4 (legend: INITn, Defaultn)


Constraint Set 1 included H° = {max-speed <= 5.5}, S° = {somewhat quickly}.
The soft constraints were extended to S° = {5.0 m/s < max-speed < 10.6 m/s, 4.0 m/s <
avg-speed < 8.6 m/s}. These runs were in general very successful. For the INIT cases, only
two iterations were needed for total success. (We did have a problem with Obstacle Set
4, where a hard limit failure below our level of precision forced 24 iterations when, in
effect, we were meeting our constraints exactly.) In the default cases, Obstacle Sets 1
and 2 reacted similarly, increasing the time weight until success was found. For Obstacle
Sets 3 and 4 in the default cases, the results were more similar to the 2DOF Constraint
Set 1 behavior, with the time weight initially over-adjusted up, then adjusted back down
in one or more steps.

Since  path  lengths  varied  from  obstacle  set  to  set,  final  times  were  not  necessarily 
similar  across  the  board.  Similar  speed  results  were  achieved  for  all  cases,  however. 

Obstacle  Set  4  requires  no  translation  in  the  y-direction,  so  there  is  no  force  there 
in  any  of  the  Constraint  Sets. 
Path for CS 2, OS 1

Path for CS 2, OS 2

Path for CS 2, OS 3

Path for CS 2, OS 4

Constraint Set 2 was only a soft constraint, S° = {“exceedingly efficiently”}. This
was extended to “low force,” which was in turn defined for this system as 2.7 N < force
< 5.4 N. INIT was again very successful with this set; for Obstacle Sets 1 and 4, it solved
the problem with the first weight vector. For OS 2 and 3, another iteration was needed; the
time weight (which was set to 0.25 by INIT) was a little too high and had to be lowered
further. The default cases all needed to adjust the time weight down, which they did in
between three and five steps. Default results were not identical to INIT results, but all
were very similar.

For OS 1, 2 and 4, we had the typical smooth acceleration followed by a smooth
deceleration, which is more what we would expect for a time-efficient trajectory.
Perhaps the values of “low” force were overestimated for smaller fields; Obstacle Set 3
gives something closer to a bang-coast-bang approximation, which is what we would
expect to see for an efficient trajectory. Our WADJ curves were generated in 50 m cube
volumes, which is closer in size to the field of Obstacle Set 3 than the others. We reiterate
that the returned trajectories met the numeric values that defined the fuzzy preference
“efficiently” - WADJ did not fail, based on what it knew. However, this result suggests
that its knowledge was faulty and that some of our ad hoc definitions require more
refinement.
Path for CS 3, OS 1

Path for CS 3, OS 2

Path for CS 3, OS 3

Velocities and Rotations for CS 3, OS 3 (legend: INITn, Defaultn; horizontal axis: Time)

Force and Torque for CS 3, OS 3 (legend: INITn, Defaultn; horizontal axis: Time)

Path for CS 3, OS 4

Constraint Set 3 was another soft-only set, S° = {“a little quickly”}. INIT was
generally more successful than the default case here: it solved three of the four obstacle
sets in a single iteration and required only two for the fourth. The default approach
typically required three or four iterations for OS 1-3. For OS 4, it seriously over-estimated
the time weight needed in its second iteration. By the fourth iteration, it was back to a more
reasonable answer, but the margin of failure was so small that the adjustments to
the time weight were also very small, and the weight was not lowered enough before the
time limit expired. The first iteration, with default weights, was found to have the
smallest errors and was returned; this is the only time in this CS that the default case found
a solution significantly different from the INIT case.
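
The fallback behavior described here, where the attempt with the smallest errors is returned once the time limit expires, can be sketched as follows. This is a minimal illustration and not the thesis software: the names `Attempt`, `solve_with_fallback`, `evaluate`, and `adjust` are hypothetical, an iteration cap stands in for the time limit, and the toy error function and overshooting adjustment rule are invented only to mimic the OS 4 pattern.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

Weights = Tuple[float, ...]


@dataclass
class Attempt:
    weights: Weights
    error: float  # total constraint violation; 0.0 means every constraint is met


def solve_with_fallback(
    evaluate: Callable[[Weights], float],
    adjust: Callable[[Weights, float], Weights],
    start: Weights,
    max_iterations: int,
) -> Attempt:
    """Iterate weight adjustments and return the lowest-error attempt seen."""
    best: Optional[Attempt] = None
    weights = start
    for _ in range(max_iterations):
        attempt = Attempt(weights, evaluate(weights))
        if best is None or attempt.error < best.error:
            best = attempt
        if attempt.error == 0.0:
            break  # total success, no further adjustment needed
        weights = adjust(weights, attempt.error)
    assert best is not None  # holds whenever max_iterations >= 1
    return best


# Toy stand-ins (assumptions): the error is the distance of the time weight
# from 1.0, and the rule overshoots once, then halves the weight each step.
def toy_error(w: Weights) -> float:
    return abs(w[0] - 1.0)


def toy_adjust(w: Weights, error: float) -> Weights:
    return (8.0,) if w[0] < 2.0 else (w[0] / 2.0,)


result = solve_with_fallback(toy_error, toy_adjust, start=(1.1,), max_iterations=4)
print(result.weights)  # the starting weights: no later attempt had a smaller error
```

With one more iteration allowed, the same toy run would reach zero error, which mirrors the observation that the weight simply was not lowered enough before time ran out.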

We  see  again  in  all  the  OS  the  typical  time-efficient  force  curves  that  we  would 
hope  to  see  in  a  “quickly”  preference. 
Path for CS 4, OS 1

Path for CS 4, OS 2

Path for CS 4, OS 3

Constraint Set 4 consisted of two hard numeric constraints, H° = {max-acc <= 1.3
m/s^2, max-speed <= 4.0 m/s}, and two soft numeric constraints, S° = {2.7 N <= force <=
5.4 N, 1.8 m/s <= avg-speed <= 4.0 m/s}. The acceleration and force are low for this system;
the avg-speed is medium-low or low; and the max-speed must be less than medium-high.
The constraints do not obviously conflict, but they will interact. We note that this CS
failed to converge for OS 4.

INIT selected weights that typically met the max-speed H° but not the max-acc H°.
The default weights failed similarly. In all cases, this set up a cyclic pattern of solutions.
Weights were lowered to meet max-acc; typically both H° were met on Iteration 2. (For
INIT, OS 2 required five iterations to meet both; for default, OS 3 required two.) In all
cases where both H° were met, both S° failed low.
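
A soft constraint “fails low” when the returned feature value falls below the bottom of its numeric band. The sketch below illustrates that classification using the CS 4 soft bands from the text. The function name `check_band` and the sample feature values are hypothetical; the values are not results from these runs.

```python
def check_band(value: float, low: float, high: float) -> str:
    """Classify a feature value against a closed numeric band [low, high]."""
    if value < low:
        return "fails low"
    if value > high:
        return "fails high"
    return "met"


# Soft bands for CS 4 as given in the text (force in N, avg-speed in m/s).
SOFT_BANDS = {"force": (2.7, 5.4), "avg-speed": (1.8, 4.0)}

# Made-up feature values, chosen only to illustrate the "both failed low" outcome.
sample = {"force": 0.6, "avg-speed": 0.7}

report = {name: check_band(sample[name], *SOFT_BANDS[name]) for name in SOFT_BANDS}
print(report)
```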

Despite  the  fact  that  most  of  the  time  weights  were  on  the  order  of  0.01,  the  force 
profiles  still  have  the  constant  thrust  characteristic  of  time-efficient  trajectories.  The 
levels  of  thrust  are  greatly  decreased  from  the  prior  constraint  sets,  and  the  time  for 
completion  greatly  extended.  The  velocity  profiles  match  well,  and  the  torques  are  zero 
to  our  level  of  precision. 
Path for CS 5, OS 1

Path for CS 5, OS 2

Path for CS 5, OS 3

Path for CS 5, OS 4

Constraint Set 5 was identical to CS 4, with “moderately safely” added to S°.
“Safely” includes a “medium high min-sep” in its definition, which is defuzzified to a
min-sep between 1.0 and 2.0 m. Like CS 4, the solution cycled back and forth between
high and low values for the time weight; in this case, LIM was overall higher, leading to
the arcs seen in OS 2 and OS 3. This does not affect the path in OS 1 at all, as it is
already sufficiently far away from the obstacle; the path and trajectory are almost
identical to those found for CS 4.

It seems strange at first that CS 5, with more constraints, would converge for OS
4 while CS 4 would not. We wondered if the greater min-sep could be the cause - if
initialized to a higher value, might the path have stayed farther away from solutions near
constraint boundaries, which are computationally difficult for BVP4C2? But the path in
CS 5 shows no significant deflection as it passes through the formation of obstacles. And
if this were the case, then the default case would still have failed to converge, as it began
with the same LIM in all runs. We can only assume some numeric fluke akin to a
symmetry in the problem space, where the solver was faced with two identical-cost
variations and could not select between them.
Path for CS 6, OS 2

Velocities and Rotations for CS 6, OS 2 (legend: INITn, Defaultn; horizontal axis: Time)

Force and Torque for CS 6, OS 2 (legend: INITn, Defaultn; horizontal axis: Time)

Path for CS 6, OS 3

Path for CS 6, OS 4

Constraint Set 6 was to be “very energy-saving,” a term we defined to mean
having low force and low torque. At first, we ran these without commanding orientation
changes; predictably, little to no torque was required. We re-ran the tests with a
commanded orientation change, and with a bug in the torque WADJ rule fixed, to get the
results shown above.

The forces are not so low as in CS 4 and 5 because the constraint is directly on
force rather than on max-acc. A low max-acc apparently corresponds to a very low force,
so these trajectories are not as fuel-efficient as those. They are considerably more so than
CS 1 and 3, which were “quick,” but are comparable to CS 2, “exceedingly efficiently,”
which required low force as well. For the initialized cases, OS 4 needed only one
iteration; OS 1 and 3 required only two. OS 2 had problems, here and in the default case,
using enough torque to meet the low end of the “low torque” requirement, and so
required six iterations, quitting when the time limit was reached (five iterations after the
first). In the default cases, OS 1 and 4 required three and six iterations, respectively,
achieving total success. Their weight adjustments were monotonic, steadily increasing
torque weight and decreasing time weight. As seen in the Chapter 5 examples, most of
the benefit was derived in the first correction. OS 2 caused the same problem as in the
initialized case; the default starting weight for OS 3 put it in an unfortunate position, and
it got caught in an iterative cycle that timed out after five runs.
Path for CS 7, OS 1

Path for CS 7, OS 2

Constraint Sets 7 and 8 were soft constraints on torque only, at “low” and
“medium,” respectively. The maneuvers required little enough torque that CS 8 had
difficulties meeting those requirements at all; the returned trajectories were so similar to
the CS 7 solutions that they are not included separately here.

CS 7 was very successful; all the cases made the constraint. Either the first
solution worked, or it used too much torque; in the latter case, the torque weight was
smoothly adjusted up until the feature value dropped sufficiently to make the limit. In CS 8,
the first solutions were all too low, and the weight was adjusted down, typically to a point
where it was below our granularity for detecting weight loops; when it was decreased again,
both attempts counted as zero and were detected as a loop. OS 1 and 2 failed for CS 8,
while OS 3 and 4 eventually succeeded.
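
The false loop detection described for CS 8 can be illustrated with a short sketch. It assumes that weight vectors are compared after rounding to a fixed granularity, so two distinct weights that are both well below the granularity collapse to the same key. The granularity value (0.01), the function names, and the sample weights are assumptions for illustration; the detection rule in the actual software may differ.

```python
from typing import List, Sequence, Tuple


def weight_key(weights: Sequence[float], granularity: float) -> Tuple[int, ...]:
    """Round each weight to the nearest multiple of the granularity."""
    return tuple(round(w / granularity) for w in weights)


def is_loop(
    history: List[Sequence[float]],
    candidate: Sequence[float],
    granularity: float = 0.01,  # assumed value, for illustration only
) -> bool:
    """Report a loop if the candidate matches any earlier weight vector
    once both are rounded to the granularity."""
    seen = {weight_key(w, granularity) for w in history}
    return weight_key(candidate, granularity) in seen


# Hypothetical torque-weight history that keeps decreasing.
history = [(0.25,), (0.03,), (0.004,)]

print(is_loop(history, (0.002,)))  # 0.004 and 0.002 both round to key 0
print(is_loop(history, (0.12,)))   # a weight well above the granularity is distinct
```

Under these assumptions, a weight that is still decreasing is flagged as a repeat once it drops below the granularity, which matches the failure mode reported for OS 1 and 2.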
Appendix C: WADJ Heuristic Graphs and Fuzzy Rule Definitions


2DOF  WADJ  Heuristics 

In this early WADJ graph, the energy feature is called U2, after its representation
as a 2-norm in the cost functional. Data is for an empty field (U2-0) and for 1, 2, and 3
obstacles (U2-1, U2-2, U2-3). W1/W2 = {2^-3, 2^-2, ..., 2^2, 2^3}.
This shows how the energy heuristic exponent changes with field size. 2m, 10m, and
20m square fields were used (noted in the legend as U2-1, U2-5, and U2-10, because the
coordinates in each went from (e.g.) (-5, -5) to (5, 5)).

Energy Use for Various Field Size
(no obstacles)

This shows the max-speed heuristic for 2m, 20m, and 200m square fields. Again the
indices (max-speed-1, etc.) refer to the corner coordinates used in the runs.

This graph shows the avg-speed heuristic degrading in the presence of obstacles. This is
Figure 10 in the text.


This graph shows the avg-speed heuristic for different field sizes. This is Figure 11 in the
text, and the field size indicators were changed to reflect the actual field size rather than
the local coordinates. So this shows corner-to-corner motion through 2m, 10m, 20m, and
200m square empty fields.

avg-speed for different field sizes


Min-sep varying with LIM for min-sep = {1, 3, 5, 7} and a single obstacle in a 10m
square field.

2DOF Fuzzy Rule Definitions


Fuzzy  rules  were  generated  from  WADJ  data  in  the  following  fashion: 

For time-based features (i.e., everything besides min-sep): We took the W1/W2 weight
vector {2^-3, 2^-2, ..., 2^2, 2^3} to correlate to the midpoints of the fuzzy levels {very low,
low, medium low, medium, medium high, high, very high}. The fuzzy triangles would
be bounded in most cases by the adjacent weight. Thus W1/W2 = 1 was the “ideal”
medium, but “medium” could range from 1/2 to 2. Then the feature values in the WADJ
data that resulted from these weights were taken as the values that defined the feature
fuzzy triangles.
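
A minimal sketch of these level definitions on the weight-ratio axis follows. It assumes triangular membership functions that peak at each W1/W2 midpoint and fall to zero at the adjacent midpoints. Linear interpolation in the ratio itself, and the outer bounds chosen for “very low” and “very high” (half and double the end midpoints), are assumptions; the names `triangle` and `memberships` are illustrative only.

```python
from typing import Dict

LEVELS = ["very low", "low", "medium low", "medium",
          "medium high", "high", "very high"]
CENTERS = [2.0 ** k for k in range(-3, 4)]  # 1/8, 1/4, 1/2, 1, 2, 4, 8


def triangle(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership: 0 at left and right, 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def memberships(ratio: float) -> Dict[str, float]:
    """Degree of membership of a W1/W2 ratio in each fuzzy level."""
    result = {}
    for i, name in enumerate(LEVELS):
        peak = CENTERS[i]
        # Bounded by the adjacent midpoints; the end levels use an assumed bound.
        left = CENTERS[i - 1] if i > 0 else peak / 2.0
        right = CENTERS[i + 1] if i < len(CENTERS) - 1 else peak * 2.0
        result[name] = triangle(ratio, left, peak, right)
    return result


# A ratio of 1 is the "ideal" medium; 1.5 sits between medium and medium high.
print({k: v for k, v in memberships(1.0).items() if v > 0})
print({k: v for k, v in memberships(1.5).items() if v > 0})
```

In the procedure described in the text, the triangles that are actually used are defined on the feature values that these weights produced in the WADJ data; the sketch shows only the weight side of that correspondence.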

For min-sep: We had defaulted to a LIM of 3 in our 2DOF work and so that
became “medium.” The rest of the LIM levels were defined fairly linearly from there.
Since min-sep and LIM are related linearly, but the slope of the relationship is unknown
at initialization, we correlated them directly so that “high LIM” will result in “high min-sep.”
The center values for W1/W2 should have been {2^-3, 2^-2, ..., 2^2, 2^3}. The centroid
computation required that they be expressed separately, as whole numbers, and divided
only at the end of the computation. Rather than express them exactly as {1/8, 1/4, ...,
4/1, 8/1}, I approximated them with ratios {3/21, 3/14, 3/7, 3/3, 7/3, 14/3, 21/3}. While I
believe that at the time (late 2003 - early 2004) I had good reasons for choosing these
numbers, today I have no idea why I did so. Given the overall fuzziness of the fuzzy
rules, the lack of precision did not hurt the algorithms, but this should be fixed in any
future version of the software.
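
The size of the discrepancy between the approximate ratios and the exact powers of two can be checked with exact rational arithmetic. The short sketch below uses only the two sets of values quoted in this note; the variable names are illustrative.

```python
from fractions import Fraction

# Exact centers 2^-3 ... 2^3 and the approximating ratios quoted in the note.
exact = [Fraction(2) ** k for k in range(-3, 4)]
approx = [Fraction(3, 21), Fraction(3, 14), Fraction(3, 7), Fraction(3, 3),
          Fraction(7, 3), Fraction(14, 3), Fraction(21, 3)]

# Relative error of each approximation with respect to the exact center.
rel_errors = [a / e - 1 for a, e in zip(approx, exact)]

for e, a, r in zip(exact, approx, rel_errors):
    # Fraction prints in lowest terms, so 3/21 appears as 1/7.
    print(f"exact {str(e):>4}  approx {str(a):>5}  relative error {float(r):+.1%}")

worst = max(abs(r) for r in rel_errors)
print("largest relative error:", worst)
```

Every approximate center lies within about 17% of its exact value, and the “medium” center is exact, which is consistent with the remark that the lack of precision did not hurt the algorithms.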
6DOF WADJ Heuristics

6DOF Fuzzy Rule Definitions